DerekSchaefer.NET I do stuff, you read about it!

26Aug/092

Google Summer of Code, Pt. 17 [FINAL]

Greetings all, this will be the last of my required updates. Now, that does not mean I will stop posting (about phpMyAdmin or otherwise), but rather this will be the last of the official "Google Summer of Code, Pt. # [ _____ ]" sequence of GSoC related updates and status reports. Nor does it spell the end of my contributions to phpMyAdmin, although the amount of time I will be able to contribute will be cut back drastically (read: no longer ~25hrs/week, or whatever the average actually was), what with school starting up again in a few weeks, work (on campus job), and other collegiate activities, but I will spend as much time as is reasonable contributing to PMA.

Last week I successfully merged my branch back into the trunk, but without the glorious automation of "svn merge" unfortunately. Ultimately, I had to do it the old fashion way by copying and moving files around, and using diff/patch where necessary. It wasn't a terribly painful or drawn out experience, but it would have been nice if SVN had worked as advertised. Now, whether the fault was my own for somehow messing up my branch along the way so that it couldn't be merged, I don't know. I tried methods listed for every major release of SVN, and none of them worked correctly. For what it's worth, keeping my branch in sync with the trunk was incredibly easy.

I hope you all make good use of my new import/export additions. Even though GSoC is over, I want everyone to know that I still welcome any suggetions or critiques regarding my changes. Don't hesitate to request a new import/export modules, addition featuers for existing ones, or really anything else. I'll be around. =D

I have a few changes that I have been waiting to make until everyone was done merging their branches back into the trunk, as I didn't want to cause additional conflicts. I believe that everyone is done with their merging at this point, so I will be applying these changes soon.

It's been great working with all of you! And to my fellow students, I hope that you'll stick around and keeping contributing to phpMyAdmin as well.

Tagged as: , 2 Comments
17Aug/090

Google Summer of Code, Pt. 16 [Week 9] – Final evaluations, ahoy!

Key Accomplishments Last Week/This Week:

  • Fixed the bug that was preventing the successful import information panel from displaying after completing an import action that made use of the type-detection API.
  • Finished adding all applicable strings to the localization files with add_message.sh
  • Added more documentation, both to the source and to the phpMyAdmin wiki.
  • Minor optimizations and bug fixes.

Key Tasks that Stalled Last Week:

  • None

Key Concerns:

  • That my branch merges without complication.

Tasks in the Upcoming Week:

  • Merging my branch back into the trunk.
  • Making sure all of my changes integrate smoothly and correctly.
  • Polishing documentation.
6Aug/090

Google Summer of Code, Pt. 15 [Week 8] – Almost there!

Key Accomplishments Last Week/This Week:

  • Completed both the XLS and XLSX import AND export modules, which means I am done... and then some!
  • Minor bugfixes in the import library.

Key Tasks that Stalled Last Week:

  • For the longest time, I wasn't able to find an XLS reader library with a compatible license, but as of this week, I have! w00t! It's called PHPExcel and can read/write XLS AND XLSX files!

Key Concerns:

  • Figuring out why my import information panel does not display with either the new XLS or XLSX import modules.

Tasks in the Upcoming Week:

  • Minor bugfixes and optimizations across the board.
  • Make sure all the code is formatted and documented correctly.
  • Make sure all applicable strings are in the localization file.
  • Append information on how to use my new MySQL type-detection functions to the import wiki page.
31Jul/090

Google Summer of Code, Pt. 14 [ODS Import/Export Plug-ins]

Open Document Spreadsheet

good_times.ods

I talked about posting this some time ago, but hadn't gotten around to it as I haven't been feeling well recently. Luckily that's passed, so here we go...

My Open Document Spreadsheet module has come a long way in the last couple weeks. It is now fully capable of:

  • Importing into an existing database
  • Importing any number of sheets
  • Determining MySQL data types for each column
  • Creating the tables and inserting the data into the selected database
  • Extracting column names from the first row of each table
  • Replicating the "Excel-style" column names (A, B, ..., Z, AA, AB, ...)
  • Importing percentages as proper decimals (12.00% -> 0.12)
  • Importing currencies as proper decimals ($5.75 -> 5.75)
  • Ignoring empty rows
  • And more...

There are still a few minor issues to be sorted out, but nothing that is terribly important.

In addition to all this, I have several potential ideas that are almost certainly beyond the scope of the Summer of Code due to their complexity. I would like to propose these here, but I will also be making a duplicate post to the PMA devel mailing list so it reaches the most eyes. It's possible that these ideas are too ambitious compared with their usefulness to the greater phpMyAdmin community, but I could see them becoming a "selling-point,"  if you will, as there is nothing quite like this in existence, as far as I have been able to devise.

The end goal would be to provide a means of importing and exporting spreadsheets (both Open Office and Excel) in their entirety. Currently, we can now import and export spreadsheets, but only the tabular data within. If there are functions of any type in the spreadsheet, they are ignored, and only their current value is imported. For many purposes that is all one needs (or expects), but there are some cases where having these greater possibilities would save a considerable amount of time, and money, for some.

For example, imagine that you have a large Excel spreadsheet that tracks your small business's finances, complete with an army of calculated columns, rows, and other functions. Now imagine that your small business becomes, well, a medium business (or some other equivalent scenario) and you need to access your finances from different locations, make online transactions automatic, etc. The obvious solution would be to turn your spreadsheet-based solution into a true database, in this case MySQL. With the aid of phpMyAdmin, the user would be able to simply import the spreadsheet workbook(s) and all the data would be loaded, as well as all the calculations. The database would go on functioning just as the spreadsheets did before but on a greater scale, and the user would not have to waste time rewriting all the calculations as a MySQL compliant trigger, function, or procedure set.

A simple example would be:

spreadsheet

This type of setup can be directly translated into a MySQL database.

First, create the base table.

CREATE TABLE `Excel_Table` (
 `Product` varchar(25) NOT NULL,
 `Qtr_1` decimal(25,2) NOT NULL,
 `Qtr_2` decimal(25,2) NOT NULL,
 `Total` decimal(25,2) NULL,
 PRIMARY KEY (`Product`)
) ENGINE=MyISAM;

Next, create the table which will contain only the total row at the bottom.

CREATE TABLE `Excel_Table_Total` (
 `Index` varchar(25) NOT NULL,
 `Qtr_1` decimal(25,2) NULL,
 `Qtr_2` decimal(25,2) NULL,
 `Total` decimal(25,2) NULL,
 PRIMARY KEY (`Index`)
) ENGINE=MyISAM;

INSERT INTO `Excel_Table_Total`
    (`Index`, `Qtr_1`, `Qtr_2`, `Total`)
VALUES
    ('Total', NULL, NULL, NULL);

After the tables have been created we can begin implementing the triggers.

-- Handle INSERT changes

DELIMITER //
CREATE TRIGGER `calc_col_insert` BEFORE INSERT ON `Excel_Table`
    FOR EACH ROW BEGIN
        SET NEW.Total = NEW.Qtr_1 + NEW.Qtr_2;
    END;
DELIMITER ;

-- Handle UPDATE changes

DELIMITER //
CREATE TRIGGER `calc_col_update` BEFORE UPDATE ON `Excel_Table`
    FOR EACH ROW BEGIN
        SET NEW.Total = NEW.Qtr_1 + NEW.Qtr_2;
    END;
DELIMITER ;

And the triggers to handle the total row.

-- Handle INSERT changes

DELIMITER //
CREATE TRIGGER `total_row_insert` AFTER INSERT ON `Excel_Table`
    FOR EACH ROW BEGIN
        DECLARE col_sum DECIMAL(25,2);
        SET col_sum = (SELECT SUM(Qtr_1) FROM `Excel_Table`);
        UPDATE `Excel_Table_Total` SET `Qtr_1` = col_sum WHERE `Index` = 'Total';
        SET col_sum = (SELECT SUM(Qtr_2) FROM `Excel_Table`);
        UPDATE `Excel_Table_Total` SET `Qtr_2` = col_sum WHERE `Index` = 'Total';
        SET col_sum = (SELECT SUM(Total) FROM `Excel_Table`);
        UPDATE `Excel_Table_Total` SET `Total` = col_sum WHERE `Index` = 'Total';
    END;
DELIMITER ;

-- Handle UPDATE changes

DELIMITER //
CREATE TRIGGER `total_row_update` AFTER UPDATE ON `Excel_Table`
    FOR EACH ROW BEGIN
        DECLARE col_sum DECIMAL(25,2);
        SET col_sum = (SELECT SUM(Qtr_1) FROM `Excel_Table`);
        UPDATE `Excel_Table_Total` SET `Qtr_1` = col_sum WHERE `Index` = 'Total';
        SET col_sum = (SELECT SUM(Qtr_2) FROM `Excel_Table`);
        UPDATE `Excel_Table_Total` SET `Qtr_2` = col_sum WHERE `Index` = 'Total';
        SET col_sum = (SELECT SUM(Total) FROM `Excel_Table`);
        UPDATE `Excel_Table_Total` SET `Total` = col_sum WHERE `Index` = 'Total';
    END;
DELIMITER ;

-- Handle DELETE changes

DELIMITER //
CREATE TRIGGER `total_row_delete` AFTER DELETE ON `Excel_Table`
    FOR EACH ROW BEGIN
        DECLARE col_sum DECIMAL(25,2);
        SET col_sum = (SELECT SUM(Qtr_1) FROM `Excel_Table`);
        UPDATE `Excel_Table_Total` SET `Qtr_1` = col_sum WHERE `Index` = 'Total';
        SET col_sum = (SELECT SUM(Qtr_2) FROM `Excel_Table`);
        UPDATE `Excel_Table_Total` SET `Qtr_2` = col_sum WHERE `Index` = 'Total';
        SET col_sum = (SELECT SUM(Total) FROM `Excel_Table`);
        UPDATE `Excel_Table_Total` SET `Total` = col_sum WHERE `Index` = 'Total';
    END;
DELIMITER ;

Insert the example data.

INSERT INTO `Excel_Table` (`Product`, `Qtr_1`, `Qtr_2`) VALUES
('Prod 1', '744.60', '162.00'),
('Prod 2', '5079.20', '1249.63'),
('Prod 3', '1267.75', '1024.00'),
('Prod 4', '1408.50', '1756.84'),
('Prod 5', '4728.36', '756.42'),
('Prod 6', '943.89', '349.60');

And finally, we create the view to tie everything together.

CREATE VIEW `Excel_Table_View` AS
    SELECT * FROM `Excel_Table`
        UNION
    SELECT * FROM `Excel_Table_Total` WHERE `Index` = 'Total';

Here we can see the view displayed in phpMyAdmin. As you can see, it is identical to the original Excel spreadsheet, both in content and behavior. When a new row is added, for instance, it will be inserted after "Prod 6" but before the "Total" row, and all of the appropriate numbers will be recalculated.

pma_ss

With the glorious addition of automation, this is essentially how the new additions would act. Although, they would be capable of interpreting and converting far more complex forumulas than just a simple column sum.

More to come...

You can download all the SQL code used in this example as one file from here: Excel2MySQL.sql

31Jul/090

Google Summer of Code, Pt. 13 [Week 7] – Awesome CSV Import Module is Awesome

Key Accomplishments Last Week:

  • Improved upon the existing CSV import module. Users can now create tables based on a CSV import file. Before, you would have to create the table schema yourself beforehand, which required extra time as well as greater MySQL knowledge. Now it is handled automatically. Presently, the module imports all the CSV content as one table. There is not a reliable way to detect where one table starts and another ends in order to import them separately, but in any case, it is something I will be pondering in the coming weeks. I toyed with splitting the contents into separate tables wherever the column count differed, but if there were two or more tables adjacent to one another with the same column count they would be mashed together. If a do implement a way to detect different tables, it will likely be based on such a method. I feel this is something I should attempt to do, as phpMyAdmin allows exporting not only multiple tables to one CSV file, but also multiple databases. So there is the potential for lots of tables in one CSV file, and if a user wanted to import that, he/she would be required to split the file by hand. Yuck.
  • Put the XLS module on the proverbial back-burner and began work on an XLSX module in its stead, as per Michal's advice.

Key Tasks that Stalled Last Week:

  • The XLS module. As I discussed in my last post, to import XLS files I made use of the phpExcelReader library, which I ended up not being able to include due to incompatible licenses with phpMyAdmin. Rewriting the BIFF library myself would be too big an unscheduled task for an import module which is not all that important. If I have extra time at the end of GSoC, I will rewrite the library for use in PMA.

Key Concerns:

  • Finishing the XLSX module.

Tasks in the Upcoming Week:

  • Determine if there are any additional options or enhancements that the CSV module could make use of, and implement them.
  • Continue work on the XLSX module. It is of a completely different format and arrangement than ODS. For one thing, I needed only concern myself with one of the ODS XML files, mainly "content.xml", whereas here the necessary data is spread across many different files in several directories within the ZIP container.
  • Move quickly so that I may be able to write my own PHP XLS reader class. While XLS compatibility is not necessary, I feel it would still make for a good addition.