Monday, 17 September 2007

Moving from MySQL to MS SQL using PEAR MDB2

MDB2 is a PEAR package that performs some abstraction on the native PHP DB calls. In short, it allows to use $DB->query($sql) instead of mysql_query or mssql_query (plus a few more things).

The following article is the list of pitfalls we had to fix in order to get a fully compatible application, that would run the same on MySQL or MSSQL (for now).

Field types

Field types are not the same between the two engines of course, but all in all it is not too bad. ENUMs don't exist in MS SQL so these will have to be moved to CHAR(1), MEDIUMTEXT should be TEXT.

If you want to use MDB2 Schema, beware that it considers ENUM('Y', 'N') to be booleans if the field starts with an "is_" or "has_", but then builds a database with boolean being TINYINT (we'll have to get back on them about that).

SQL syntax

There's a few things that work in MySQL and breaks in MS SQL. Here's what we found:


  • REPLACE INTO: this one is a bugger, we used REPLACE extensively. Well that doesn't exist on MS SQL.

  • ISNULL: ISNULL(somefield) should be somefield IS NULL. That's the ANSI way and ISNULL won't work on MS SQL...

  • IFNULL: IFNULL is ISNULL on MS SQL, but you should really use the function COALESCE, which works the same (when used with two parameters) and is more ANSI.

  • ORDER BY, GROUP BY: on MS SQL, you should be carefull that all the fields that you want to ORDER BY on, should be in the SELECT clause. Same goes for GROUP BY it seems. See further for GROUP BY as it can be a real bugger...

  • COUNT: on MS SQL, COUNT only works with one field somehow

  • DISTINCT: DISTINCT is a bit touchy on MS SQL - you should be carefull not to have fields that have type TEXT for example.

  • NOW: Won't work on MS SQL. MDB2 provides some utility functions to build dates for SQL, but that really does a date('Y-m-d H:i:s')

  • TO_DAYS: Won't work on MS SQL either. We really missed that one.

  • LIMIT: LIMIT doesn't exist in MS SQL, but MDB2 prevides a setLimit() function that will do the trick on all DBMS. Good enough.



AutoIncremented fields

That's a big one. It'd be a good thing that DBMS makers could standardize on something for autoincremented indexes.

In MySQL, if your field id is an autoincremented field, you could do:

INSERT INTO my_table (id, name) VALUES (NULL, 'first');

That won't work on MS SQL (you can't leave a NULL value on them). The proper way to do it with MDB2 is as follows:


$values = array();

$values['name'] = $GLOBALS['mdb2']->quote("first");

$newId = $GLOBALS['mdb2']->extended->getBeforeID('my_table', 'id', true, true);
if($newId != 'NULL')
$values['id'] = $newId;

$sql = "INSERT INTO my_table (".implode(',', array_keys($values)).") ".
" VALUES (" . implode(',', $values) . ")";

$result = $GLOBALS['mdb2']->query($sql);
if (EpiDBTools::IsMDB2Error($result))
return false;

$newId = $GLOBALS['mdb2']->extended->getAfterID($newId, 'my_table', 'id');
if (PEAR::isError($newId))
{
$newId = false;
return false;
}


It's a bit verbose but it works. The getBeforeID is required if you want your code to work some day on PostgreSQL, which uses sequences instead of AutoIncrement values.

Now if you want to insert a row in table with an autoincremented field, and still want to explicit the value of the index, you'll need to enable the IDENTITY_INSERT first, like this:


if ($GLOBALS['mdb2']->dbsyntax == 'mssql')
{
// On MS SQL, allow temporarilly to insert a row by specifying the id
$result = $GLOBALS['mdb2']->query('SET IDENTITY_INSERT my_table ON');
if (PEAR::isError($result))
return false;
}

$sql = "INSERT INTO my_table (id, name) VALUES (4, 'first')";

$result = $GLOBALS['mdb2']->query($sql);
if (PEAR::isError($result))
return false;

if ($GLOBALS['mdb2']->dbsyntax == 'mssql')
{
$result = $GLOBALS['mdb2']->query('SET IDENTITY_INSERT my_table OFF');
if (PEAR::isError($result))
return false;
}


Even more verbose, but it works.

MS SQL settings


  • ANSI_NULL_DFLT_ON: you should set this parameter to ON, if you don't want all your fields to be NOT NULL...

  • IMPLICIT_TRANSACTIONS: beware that if you leave that ON, then you'll have to do COMMITs from time to time, otherwise your changes won't be taken in account in the database. If you come from the MyISAM world, set this one to off (which is ON by default if you set ANSI_DEFAULTS to on).



GROUP BY

In MySQL, I allow myself to do queries like the following to get all the clients that have at least one order:

SELECT * FROM client
INNER JOIN order ON order.client_id = client.id
GROUP BY client.id

That won't work with MS SQL. In the SELECT clause, you are allowed to put things that appear in the GROUP BY clause (ie. client.id), or GROUP BY functions (MAX, AVERAGE, ...). It appears that MySQL is much more flexible than MSSQL regarding to this.

Labels: , , , , ,

Tuesday, 8 May 2007

Dynamic inheritence for application extension in PHP

In any big enough PHP application, comes a time when some of the objects that are used in the application might be overriden by some extensions, to customize the functionnality of the application.

Let's take a example: your application has a class Employee that deals with the logic for every employee in your client's organization. Now imagine you have a client that wants to buy your application but wants to synchronize the employee table with its LDAP directory - fair enough.

What we want to do, is really to extend the Employee class, so that we can change the code of a few functions and add a few others to give your final client the behaviour he asks for.

So we add an extension (a folder really) where we put a class that extends Employee. We call that class LDAP_Employee for the sake of clarity:

class LDAP_Employee extends Employee
{
  function LDAP_Employee($id)
  {
    $this->Employee($id);
  }

  // Some more code to deal with the LDAP directory ...
  // ...
}


Now the problem we have, is that everywhere in our application, the instances are still created out of the original Employee class.

The trick is to use function variables to create the instances.

Let's say you have a global variable defined at the top of the application:

$GLOBALS['ClassDefinitions']['Employee'] = 'Employee'; // Name of the original class

Now, to instanciate an employee we just need to do:

$MyEmployee = new $GLOBALS['ClassDefinitions']['Employee']($MyId);

If we want to use the LDAP_Employee class, we just need to change the global once, high enough in the application initialization routine - where the extensions are loaded, basically:

$GLOBALS['ClassDefinitions']['Employee'] = 'LDAP_Employee'; // Now we'll use the overriden class everywhere

The rest of the code is left unchanged, but the behaviour of the whole application is now changed.

The only downside of this method, is that the overriden classes need to know exactly what class they extend (you can't have a variable after the extend keyword). So if you have a second extension that wanted to extends Employee, you're cooked. (you can, however, have a third extension that extends LDAP_Employee, of course)

Labels: ,

Friday, 6 April 2007

file_exists - the performance killer....

I recently went into some profiling for one of our apps, because a page was taking more than two minutes to display...

I found, to my surprise, that a call to file_exists (in a loop) was taking 95 percent of that time... Wtf ??? I just cached the result to the call to gain that much time... Good to know !

As a side note, I did the profiling with the latest version of PHPEd (Nusphere), great tool.

Labels: , ,

Friday, 5 January 2007

Fun with SCORM 2004

I'm doing some work making one of our app "Scorm 2004" compliant. I'm baffled to see how easy it is now that we have prototype.js, as if the people who designed SCORM way back then had XHRs in mind... In any case, it helps to just do a ajax.request for the commit function...

Now, they didn't always have such a good visionnary view on things, especially when they decided to switch the format of cmi.session_time. WTF is that ? PT3H5M3.5S for 3 hours 5 minutes and 3.5 seconds...? Not only is it complex to parse (for the PHP minded, I give away the code right below), but now look at this: P1Y3M2DT3H equals 1 year, 3 months, 2 days and 3 hours. Her.... What's a month, please ? Is that 30 days ? 31 ? Should I guess ? And why choose the same character for months and minutes ? bah.

Ok. Now some code:
function GetSecondsForScormPeriod($strScormPeriod)
{
$matches = array();

if (!preg_match("/^P([0-9]+Y)?([0-9]+M)?([0-9]+D)?(T([0-9]+H)?([0-9]+M)?([0-9]+(.[0-9]+)?S)?)?$/", $strScormPeriod, $matches))
return 0;

$seconds = 0;
$InTime = false;

foreach ($matches as $aMatch)
{
if ($aMatch{0} == 'P')
continue;

if ($aMatch{0} == 'T')
{
$InTime = true;
continue;
}

$unit = substr($aMatch, -1);
$val = substr($aMatch, 0, -1);

switch ($unit)
{
case 'Y': $seconds += (int)($val) * 365 * 24 * 60 * 60; break;

case 'M':
if (!$InTime)
$seconds += (int)($val) * 30 * 24 * 60 * 60;
else
$seconds += (int)($val) * 60;
break;

case 'D': $seconds += (int)($val) * 24 * 60 * 60; break;
case 'H': $seconds += (int)($val) * 60 * 60; break;
case 'S': $seconds += (float)($val); break;
}
}

return $seconds;
}

Labels: ,