Message ID | 20200703031040.27579-1-kevr.gtalk@gmail.com |
---|---|
State | New |
Headers | show |
Series | Bump RPC API Version 6 | expand |
On Thu, 02 Jul 2020 at 23:10:40, Kevin Morris wrote: > [...] > Signed-off-by: Kevin Morris <kevr.gtalk@gmail.com> > --- > doc/rpc.txt | 4 +++ > web/lib/aurjson.class.php | 66 ++++++++++++++++++++++++++++++++++----- > 2 files changed, 62 insertions(+), 8 deletions(-) Thanks Kevin! A few comments below. > diff --git a/doc/rpc.txt b/doc/rpc.txt > index 3148ebea..b0f5c4e1 100644 > @@ -472,6 +472,33 @@ class AurJSON { > return array('ids' => $id_args, 'names' => $name_args); > } > > + /* > + * Prepare a WHERE statement for each keyword: append $func($keyword) > + * separated by $delim. Each keyword is sanitized as wildcards before > + * it's passed to $func. What does that last part of the comment mean? Doesn't sanitization happen in $func itself? > + * > + * @param $delim Delimiter to use in the middle of two keywords. > + * @param $keywords Array of keywords to prepare. > + * @param $func A function that returns a string. This value is concatenated. > + * > + * @return A WHERE condition statement of keywords separated by $delim. > + */ > + private function join_where($delim, $keywords, $func) { > + // Applied to each item to concatenate our entire statement. > + $reduce_func = function($carry, $item) use(&$func) { > + array_push($carry, $func($item)); > + return $carry; > + }; > + > + // Manual array_reduce with a local lambda. > + $acc = array(); // Initial > + foreach ($keywords as &$keyword) { > + $acc += $reduce_func($acc, $keyword); I am slightly puzzled by the implementation here; why is there a need to have array_push() above and also use the += operator? Can't we simply call $func() directly and use array_push() here? > + } > + > + return join(" $delim ", $acc); It might make sense to just use $delim here and use " AND " in the caller (I actually skipped this function on my first read and was asking myself why the caller doesn't put spaces around "AND", that's the behavior people expect from a join function). Either that or rename the function slightly; in the latter case it could even make sense to drop the parameter altogether and hardcode the AND, unless you think the function could be used for something else in the future.
Thanks for the look/review Lukas; fixed up some things in the code that clears all these points up -- the comments for `join_where` were outdated, forgot to update it -- renamed some variables and swapped to `array_reduce` instead of the manual loop. Now that I can see, there's no reason that it wasn't `array_reduce` before other than some of my personal confusion with php and it's error messages. `$delim` is fixed up so the caller just includes spaces if they want like you thought -- and i do like it more that it's standardized. I'd like to leave join_where there because I use it in two places and didn't really want to repeat the logic -- I'm also foreseeing that it can be used later to create more WHERE statements out of lists. (Hopefully) final patch being submitted within a few minutes. On Sat, Jul 4, 2020 at 6:05 AM Lukas Fleischer <lfleischer@archlinux.org> wrote: > On Thu, 02 Jul 2020 at 23:10:40, Kevin Morris wrote: > > [...] > > Signed-off-by: Kevin Morris <kevr.gtalk@gmail.com> > > --- > > doc/rpc.txt | 4 +++ > > web/lib/aurjson.class.php | 66 ++++++++++++++++++++++++++++++++++----- > > 2 files changed, 62 insertions(+), 8 deletions(-) > > Thanks Kevin! A few comments below. > > > diff --git a/doc/rpc.txt b/doc/rpc.txt > > index 3148ebea..b0f5c4e1 100644 > > @@ -472,6 +472,33 @@ class AurJSON { > > return array('ids' => $id_args, 'names' => $name_args); > > } > > > > + /* > > + * Prepare a WHERE statement for each keyword: append > $func($keyword) > > + * separated by $delim. Each keyword is sanitized as wildcards > before > > + * it's passed to $func. > > What does that last part of the comment mean? Doesn't sanitization > happen in $func itself? > > > + * > > + * @param $delim Delimiter to use in the middle of two keywords. > > + * @param $keywords Array of keywords to prepare. > > + * @param $func A function that returns a string. This value is > concatenated. > > + * > > + * @return A WHERE condition statement of keywords separated by > $delim. > > + */ > > + private function join_where($delim, $keywords, $func) { > > + // Applied to each item to concatenate our entire > statement. > > + $reduce_func = function($carry, $item) use(&$func) { > > + array_push($carry, $func($item)); > > + return $carry; > > + }; > > + > > + // Manual array_reduce with a local lambda. > > + $acc = array(); // Initial > > + foreach ($keywords as &$keyword) { > > + $acc += $reduce_func($acc, $keyword); > > I am slightly puzzled by the implementation here; why is there a need to > have array_push() above and also use the += operator? Can't we simply > call $func() directly and use array_push() here? > > > + } > > + > > + return join(" $delim ", $acc); > > It might make sense to just use $delim here and use " AND " in the > caller (I actually skipped this function on my first read and was asking > myself why the caller doesn't put spaces around "AND", that's the > behavior people expect from a join function). Either that or rename the > function slightly; in the latter case it could even make sense to drop > the parameter altogether and hardcode the AND, unless you think the > function could be used for something else in the future. >
diff --git a/doc/rpc.txt b/doc/rpc.txt index 3148ebea..b0f5c4e1 100644 --- a/doc/rpc.txt +++ b/doc/rpc.txt @@ -39,6 +39,10 @@ Examples `/rpc/?v=5&type=search&by=makedepends&arg=boost` `search` with callback:: `/rpc/?v=5&type=search&arg=foobar&callback=jsonp1192244621103` +`search` with API Version 6 for packages containing `cookie` AND `milk`:: + `/rpc/?v=6&type=search&arg=cookie%20milk` +`search` with API Version 6 for packages containing `cookie milk`:: + `/rpc/?v=6&type=search&arg="cookie milk"` `info`:: `/rpc/?v=5&type=info&arg[]=foobar` `info` with multiple packages:: diff --git a/web/lib/aurjson.class.php b/web/lib/aurjson.class.php index 0ac586fe..a3224e57 100644 --- a/web/lib/aurjson.class.php +++ b/web/lib/aurjson.class.php @@ -80,7 +80,7 @@ class AurJSON { if (isset($http_data['v'])) { $this->version = intval($http_data['v']); } - if ($this->version < 1 || $this->version > 5) { + if ($this->version < 1 || $this->version > 6) { return $this->json_error('Invalid version specified.'); } @@ -140,7 +140,7 @@ class AurJSON { } /* - * Check if an IP needs to be rate limited. + * Check if an IP needs to be rate limited. * * @param $ip IP of the current request * @@ -192,7 +192,7 @@ class AurJSON { $value = get_cache_value('ratelimit-ws:' . $ip, $status); if (!$status || ($status && $value < $deletion_time)) { if (set_cache_value('ratelimit-ws:' . $ip, $time, $window_length) && - set_cache_value('ratelimit:' . $ip, 1, $window_length)) { + set_cache_value('ratelimit:' . $ip, 1, $window_length)) { return; } } else { @@ -370,7 +370,7 @@ class AurJSON { } elseif ($this->version >= 2) { if ($this->version == 2 || $this->version == 3) { $fields = implode(',', self::$fields_v2); - } else if ($this->version == 4 || $this->version == 5) { + } else if ($this->version >= 4 && $this->version <= 6) { $fields = implode(',', self::$fields_v4); } $query = "SELECT {$fields} " . @@ -472,6 +472,33 @@ class AurJSON { return array('ids' => $id_args, 'names' => $name_args); } + /* + * Prepare a WHERE statement for each keyword: append $func($keyword) + * separated by $delim. Each keyword is sanitized as wildcards before + * it's passed to $func. + * + * @param $delim Delimiter to use in the middle of two keywords. + * @param $keywords Array of keywords to prepare. + * @param $func A function that returns a string. This value is concatenated. + * + * @return A WHERE condition statement of keywords separated by $delim. + */ + private function join_where($delim, $keywords, $func) { + // Applied to each item to concatenate our entire statement. + $reduce_func = function($carry, $item) use(&$func) { + array_push($carry, $func($item)); + return $carry; + }; + + // Manual array_reduce with a local lambda. + $acc = array(); // Initial + foreach ($keywords as &$keyword) { + $acc += $reduce_func($acc, $keyword); + } + + return join(" $delim ", $acc); + } + /* * Performs a fulltext mysql search of the package database. * @@ -492,13 +519,36 @@ class AurJSON { if (strlen($keyword_string) < 2) { return $this->json_error('Query arg too small.'); } - $keyword_string = $this->dbh->quote("%" . addcslashes($keyword_string, '%_') . "%"); if ($search_by === 'name') { - $where_condition = "(Packages.Name LIKE $keyword_string)"; + if ($this->version >= 6) { + $keywords = preg_split('/"([^"]*)"|\h+/', $keyword_string, -1, + PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE); + $func = function($keyword) { + $str = $this->dbh->quote("%" . addcslashes($keyword, '%_') . "%"); + return "(Packages.Name LIKE $str)"; + }; + $where_condition = $this->join_where("AND", $keywords, $func); + } else { + $keyword_string = $this->dbh->quote( + "%" . addcslashes($keyword_string, '%_') . "%"); + $where_condition = "(Packages.Name LIKE $keyword_string)"; + } } else if ($search_by === 'name-desc') { - $where_condition = "(Packages.Name LIKE $keyword_string OR "; - $where_condition .= "Description LIKE $keyword_string)"; + if ($this->version >= 6) { + $keywords = preg_split('/"([^"]*)"|\h+/', $keyword_string, -1, + PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE); + $func = function($keyword) { + $str = $this->dbh->quote("%" . addcslashes($keyword, '%_') . "%"); + return "(Packages.Name LIKE $str OR Description LIKE $str)"; + }; + $where_condition = $this->join_where("AND", $keywords, $func); + } else { + $keyword_string = $this->dbh->quote( + "%" . addcslashes($keyword_string, '%_') . "%"); + $where_condition = "(Packages.Name LIKE $keyword_string "; + $where_condition .= "OR Description LIKE $keyword_string)"; + } } } else if ($search_by === 'maintainer') { if (empty($keyword_string)) {
API Version 6 modifies `type=search` functionality: Split `arg` into keywords separated by spaces (quoted keywords are preserved). Search for packages containing the literal keyword `blah blah` AND `haha`: https://aur.archlinux.org/rpc/?v=6&type=search&arg="blah blah"%20haha Search for packages containing the literal keyword `abc 123`: https://aur.archlinux.org/rpc/?v=6&type=search&arg="abc 123" The following example searches for packages that contain `blah` AND `abc`: https://aur.archlinux.org/rpc/?v=6&type=search&arg=blah%20abc The legacy method still searches for packages that contain 'blah abc': https://aur.archlinux.org/rpc/?v=5&type=search&arg=blah%20abc https://aur.archlinux.org/rpc/?v=5&type=search&arg=blah%20abc API Version 6 is only considered during a `search` of `name` or `name-desc`. Note: This change was written as a solution to https://bugs.archlinux.org/task/49133. PS: + Some spacing issues fixed in comments. Signed-off-by: Kevin Morris <kevr.gtalk@gmail.com> --- doc/rpc.txt | 4 +++ web/lib/aurjson.class.php | 66 ++++++++++++++++++++++++++++++++++----- 2 files changed, 62 insertions(+), 8 deletions(-)