DerekSchaefer.NET I do stuff, you read about it!

31Jul/090

Google Summer of Code, Pt. 14 [ODS Import/Export Plug-ins]

Open Document Spreadsheet

good_times.ods

I talked about posting this some time ago, but hadn't gotten around to it as I haven't been feeling well recently. Luckily that's passed, so here we go...

My Open Document Spreadsheet module has come a long way in the last couple weeks. It is now fully capable of:

  • Importing into an existing database
  • Importing any number of sheets
  • Determining MySQL data types for each column
  • Creating the tables and inserting the data into the selected database
  • Extracting column names from the first row of each table
  • Replicating the "Excel-style" column names (A, B, ..., Z, AA, AB, ...)
  • Importing percentages as proper decimals (12.00% -> 0.12)
  • Importing currencies as proper decimals ($5.75 -> 5.75)
  • Ignoring empty rows
  • And more...

There are still a few minor issues to be sorted out, but nothing that is terribly important.

In addition to all this, I have several potential ideas that are almost certainly beyond the scope of the Summer of Code due to their complexity. I would like to propose these here, but I will also be making a duplicate post to the PMA devel mailing list so it reaches the most eyes. It's possible that these ideas are too ambitious compared with their usefulness to the greater phpMyAdmin community, but I could see them becoming a "selling-point,"  if you will, as there is nothing quite like this in existence, as far as I have been able to devise.

The end goal would be to provide a means of importing and exporting spreadsheets (both Open Office and Excel) in their entirety. Currently, we can now import and export spreadsheets, but only the tabular data within. If there are functions of any type in the spreadsheet, they are ignored, and only their current value is imported. For many purposes that is all one needs (or expects), but there are some cases where having these greater possibilities would save a considerable amount of time, and money, for some.

For example, imagine that you have a large Excel spreadsheet that tracks your small business's finances, complete with an army of calculated columns, rows, and other functions. Now imagine that your small business becomes, well, a medium business (or some other equivalent scenario) and you need to access your finances from different locations, make online transactions automatic, etc. The obvious solution would be to turn your spreadsheet-based solution into a true database, in this case MySQL. With the aid of phpMyAdmin, the user would be able to simply import the spreadsheet workbook(s) and all the data would be loaded, as well as all the calculations. The database would go on functioning just as the spreadsheets did before but on a greater scale, and the user would not have to waste time rewriting all the calculations as a MySQL compliant trigger, function, or procedure set.

A simple example would be:

spreadsheet

This type of setup can be directly translated into a MySQL database.

First, create the base table.

CREATE TABLE `Excel_Table` (
 `Product` varchar(25) NOT NULL,
 `Qtr_1` decimal(25,2) NOT NULL,
 `Qtr_2` decimal(25,2) NOT NULL,
 `Total` decimal(25,2) NULL,
 PRIMARY KEY (`Product`)
) ENGINE=MyISAM;

Next, create the table which will contain only the total row at the bottom.

CREATE TABLE `Excel_Table_Total` (
 `Index` varchar(25) NOT NULL,
 `Qtr_1` decimal(25,2) NULL,
 `Qtr_2` decimal(25,2) NULL,
 `Total` decimal(25,2) NULL,
 PRIMARY KEY (`Index`)
) ENGINE=MyISAM;

INSERT INTO `Excel_Table_Total`
    (`Index`, `Qtr_1`, `Qtr_2`, `Total`)
VALUES
    ('Total', NULL, NULL, NULL);

After the tables have been created we can begin implementing the triggers.

-- Handle INSERT changes

DELIMITER //
CREATE TRIGGER `calc_col_insert` BEFORE INSERT ON `Excel_Table`
    FOR EACH ROW BEGIN
        SET NEW.Total = NEW.Qtr_1 + NEW.Qtr_2;
    END;
DELIMITER ;

-- Handle UPDATE changes

DELIMITER //
CREATE TRIGGER `calc_col_update` BEFORE UPDATE ON `Excel_Table`
    FOR EACH ROW BEGIN
        SET NEW.Total = NEW.Qtr_1 + NEW.Qtr_2;
    END;
DELIMITER ;

And the triggers to handle the total row.

-- Handle INSERT changes

DELIMITER //
CREATE TRIGGER `total_row_insert` AFTER INSERT ON `Excel_Table`
    FOR EACH ROW BEGIN
        DECLARE col_sum DECIMAL(25,2);
        SET col_sum = (SELECT SUM(Qtr_1) FROM `Excel_Table`);
        UPDATE `Excel_Table_Total` SET `Qtr_1` = col_sum WHERE `Index` = 'Total';
        SET col_sum = (SELECT SUM(Qtr_2) FROM `Excel_Table`);
        UPDATE `Excel_Table_Total` SET `Qtr_2` = col_sum WHERE `Index` = 'Total';
        SET col_sum = (SELECT SUM(Total) FROM `Excel_Table`);
        UPDATE `Excel_Table_Total` SET `Total` = col_sum WHERE `Index` = 'Total';
    END;
DELIMITER ;

-- Handle UPDATE changes

DELIMITER //
CREATE TRIGGER `total_row_update` AFTER UPDATE ON `Excel_Table`
    FOR EACH ROW BEGIN
        DECLARE col_sum DECIMAL(25,2);
        SET col_sum = (SELECT SUM(Qtr_1) FROM `Excel_Table`);
        UPDATE `Excel_Table_Total` SET `Qtr_1` = col_sum WHERE `Index` = 'Total';
        SET col_sum = (SELECT SUM(Qtr_2) FROM `Excel_Table`);
        UPDATE `Excel_Table_Total` SET `Qtr_2` = col_sum WHERE `Index` = 'Total';
        SET col_sum = (SELECT SUM(Total) FROM `Excel_Table`);
        UPDATE `Excel_Table_Total` SET `Total` = col_sum WHERE `Index` = 'Total';
    END;
DELIMITER ;

-- Handle DELETE changes

DELIMITER //
CREATE TRIGGER `total_row_delete` AFTER DELETE ON `Excel_Table`
    FOR EACH ROW BEGIN
        DECLARE col_sum DECIMAL(25,2);
        SET col_sum = (SELECT SUM(Qtr_1) FROM `Excel_Table`);
        UPDATE `Excel_Table_Total` SET `Qtr_1` = col_sum WHERE `Index` = 'Total';
        SET col_sum = (SELECT SUM(Qtr_2) FROM `Excel_Table`);
        UPDATE `Excel_Table_Total` SET `Qtr_2` = col_sum WHERE `Index` = 'Total';
        SET col_sum = (SELECT SUM(Total) FROM `Excel_Table`);
        UPDATE `Excel_Table_Total` SET `Total` = col_sum WHERE `Index` = 'Total';
    END;
DELIMITER ;

Insert the example data.

INSERT INTO `Excel_Table` (`Product`, `Qtr_1`, `Qtr_2`) VALUES
('Prod 1', '744.60', '162.00'),
('Prod 2', '5079.20', '1249.63'),
('Prod 3', '1267.75', '1024.00'),
('Prod 4', '1408.50', '1756.84'),
('Prod 5', '4728.36', '756.42'),
('Prod 6', '943.89', '349.60');

And finally, we create the view to tie everything together.

CREATE VIEW `Excel_Table_View` AS
    SELECT * FROM `Excel_Table`
        UNION
    SELECT * FROM `Excel_Table_Total` WHERE `Index` = 'Total';

Here we can see the view displayed in phpMyAdmin. As you can see, it is identical to the original Excel spreadsheet, both in content and behavior. When a new row is added, for instance, it will be inserted after "Prod 6" but before the "Total" row, and all of the appropriate numbers will be recalculated.

pma_ss

With the glorious addition of automation, this is essentially how the new additions would act. Although, they would be capable of interpreting and converting far more complex forumulas than just a simple column sum.

More to come...

You can download all the SQL code used in this example as one file from here: Excel2MySQL.sql