我有一个外部项目,它需要运行一个包含多个SQL查询的SQL文件(如果您愿意,它有点像mysqldump,但是是用户创建的,带有任何类型的SQL语句/定义)。
PHP ()不允许多个查询,当(例如)将站点模块/小部件更新为更高版本时,这个问题经常发生,并且存在某种类型的update.sql文件。非安装的调制解调器。
鉴于以下(1):
//DB connection values
$sHost = "localhost";
$sName = "test";
$sUser = "";
$sPass = "";
$sPort = 3307;
//The following could be retrieved using file_get_contents, or a file streamer
$sFileContents = <<理想的解决方案是在命令行执行SQL文件,如下面的代码所示(使用前面的示例):
//Variant 1: Run a local SQL file. Since we stored our SQL contents in a
//variable (could have been retrieved before using eg. file_get_contents),
//we need to temporarily create a file for this
$sTempFile = tempnam(sys_get_temp_dir(), 'Sql');
//Create the temp file
if(!file_put_contents($sTempFile, $sFileContents)) {
trigger_error("Failed to create temporary file", E_USER_ERROR);
}
//Assemble the command
$sCommand = 'mysql'
. ' --host=' . $sHost
. ' --port=' . $sPort
. ' --user=' . $sUser
. ' --password=' . $sPass
. ' --database=' . $sName
. ' --execute="SOURCE ' . $sTempFile . '"'
;
$sOutput = shell_exec($sCommand);
//Cleanup: remove the temp file
if(!unlink($sTempFile)) {
trigger_error("Failed to remove temporary file", E_USER_ERROR);
}...but一些项目位于共享服务器或其他限制访问服务器上,在这些服务器上不允许执行shell,或者可能无法使用mysql命令。
因此,为了避免导入大量的外部解析库以解决与站点相关的小问题,许多项目循环遍历SQL文件,并将每个;字符拆分为一个新的查询,该查询将单独执行。因为SQL值也可以包含这些字符,因此在特殊情况下,此过程将失败,并且只对一行末尾的字符进行拆分(但对于多行SQL值而言,这仍然失败,该字符出现在末尾)。
我建议的解决方案是正确解析SQL文件/内容,并根据SQL标准拆分查询真正结束的查询。我在这方面寻求帮助。
我的代码(前面的第一个块示例):
//Variant 2: Run a parser
//Connect to the database
$rMysqlI = new mysqli("localhost", "", "", "test", $sPort);
if ($rMysqlI->connect_errno) {
trigger_error("Failed to connect to MySQL: (" . $rMysqlI->connect_errno . ") " . $rMysqlI->connect_error, E_USER_ERROR);
}
//START_OF_PARSER
$iCur = 0; //Current character pointer inside the SQL content
$iInside = 0; //The context, in which the pointer is currently located (is the pointer inside a
//comment, an SQL query, or deeper into an SQL query value?)
$sBuffer = ""; //The buffer of the next individual query
$aQueries = array(); //The list of queries
while($iCur < strlen($sFileContents)) {
switch ($iInside) {
case 0: //Inside query-context
//Change context: Comments beginning with --
if(substr($sFileContents, $iCur, 2) === "--") {
$iCur++;
$iInside = 2;
//Change context: Comments beginning with /*
} elseif(substr($sFileContents, $iCur, 2) === "/*") {
$iCur++;
$iInside = 3;
//Change context: Comments beginning with #
} elseif(substr($sFileContents, $iCur, 1) === "#") {
$iInside = 2;
//Separator for a new query
} elseif(substr($sFileContents, $iCur, 1) === ";") {
$aQueries[] = trim($sBuffer); //$sBuffer; //Add current buffer to a unique array query item
$sBuffer = ""; //Start a new buffer
//Change context: query values opened with '
} elseif(substr($sFileContents, $iCur, 1) === "'") {
$sBuffer .= substr($sFileContents, $iCur, 1);
$iInside = 1;
//Change context: query values opened with "
} elseif(substr($sFileContents, $iCur, 1) === '"') {
$sBuffer .= substr($sFileContents, $iCur, 1);
$iInside = 4;
//Not a special character
} else {
$sBuffer .= substr($sFileContents, $iCur, 1);
}
break;
case 1: //Inside value-context, ending with '
//Escaping character found within the query-value
if(substr($sFileContents, $iCur, 1) === "\\") {
$sBuffer .= substr($sFileContents, $iCur, 2);
$iCur++; //Skip next char
//The ending character for the query-value is found
} elseif(substr($sFileContents, $iCur, 1) === "'") {
$sBuffer .= substr($sFileContents, $iCur, 1);
$iInside = 0;
//Not a special character
} else {
$sBuffer .= substr($sFileContents, $iCur, 1);
}
break;
case 4: //Inside value-context, ending with "
//Escaping character found within the query-value
if(substr($sFileContents, $iCur, 1) === "\\") {
$sBuffer .= substr($sFileContents, $iCur, 2);
$iCur = $iCur + 1; //Skip next char
//The ending character for the query-value is found
} elseif(substr($sFileContents, $iCur, 1) === '"') {
$sBuffer .= substr($sFileContents, $iCur, 1);
$iInside = 0;
//Not a special character
} else {
$sBuffer .= substr($sFileContents, $iCur, 1);
}
break;
case 2: //Inside comment-context, ending with newline
//A two-character newline is found, signalling the end of the comment
if(substr($sFileContents, $iCur, 2) === "\r\n") {
$iCur++;
$iInside = 0;
//A single-character newline is found, signalling the end of the comment
} elseif(substr($sFileContents, $iCur, 1) === "\n" || substr($sFileContents, $iCur, 1) === "\r") {
$iInside = 0;
}
break;
case 3: //Inside comment-context, ending with */
//A two-character */ is found, signalling the end of the comment
if(substr($sFileContents, $iCur, 2) === "*/") {
$iCur++;
$iInside = 0;
}
break;
default:
break;
}
$iCur++;
}
//END_OF_PARSER
//Preview our results
foreach($aQueries as $sQuery) {
if (!$rMysqlI->query($sQuery)) {
echo "ERROR \"{$sQuery}\": (" . $rMysqlI->errno . ") " . $rMysqlI->error . "
", E_USER_ERROR;
} else {
echo "SUCCESS \"{$sQuery}\"
", E_USER_ERROR;
}
}对于我的问题领域,我目前只对START_OF_PARSER和END_OF_PARSER之间的内容感兴趣。解析器位,或将SQL内容转换为单个查询数组的代码块,这些查询可以/可能在稍后单独执行(如上面的foreach所做),产生与使用变体1代码块相同的结果。
我对我的代码的担忧是:
查询值封装字符:它们总是‘或",还是使用了其他查询值封装字符?查询-值转义字符:我不认为使用您自己的转义技巧(安全原因.)是个好主意,但是现在忽略安全性考虑,我假设只需要寻找一个转义字符(反斜杠),就可以发现哪个查询值字符才是真正的查询值字符。但是反斜杠是其他SQL方言中唯一的转义字符(或方法)吗?Comment忽略:为了删除注释,我查找/*和*/、-和换行符、#和换行符。够了吗?DB抽象:我假设是MySQL,但我知道我要解决的问题是PearDB,它可能是Oracle、PostGreSQL或LiteSQL等等。这些SQL方言是否也有相同的字符用于注释、转义、值封装等等?这对大多数DBs有用吗?执行这段代码似乎可以在Windows (大概也是Linux )下工作。
发布于 2018-02-14 03:49:11
为什么要构建SQL解析器?对于更适合于其他方法的用例来说,这似乎是大量的工作和潜在的脆弱性。您正在处理一个大型SQL脚本。以这样的方式执行。别把它分解了。
https://codereview.stackexchange.com/questions/187488
复制相似问题