Bump RPC API Version 6

Message ID 20200704211427.20535-1-kevr.gtalk@gmail.com
State New
Headers show
Series Bump RPC API Version 6 | expand

Commit Message

Kevin Morris July 4, 2020, 9:14 p.m. UTC
API Version 6 modifies `type=search` functionality: Split `arg` into
keywords separated by spaces (quoted keywords are preserved).

Search for packages containing the literal keyword `blah blah` AND `haha`:
https://aur.archlinux.org/rpc/?v=6&type=search&arg="blah blah"%20haha

Search for packages containing the literal keyword `abc 123`:
https://aur.archlinux.org/rpc/?v=6&type=search&arg="abc 123"

The following example searches for packages that contain `blah` AND `abc`:
https://aur.archlinux.org/rpc/?v=6&type=search&arg=blah%20abc

The legacy method still searches for packages that contain 'blah abc':
https://aur.archlinux.org/rpc/?v=5&type=search&arg=blah%20abc
https://aur.archlinux.org/rpc/?v=5&type=search&arg=blah%20abc

API Version 6 is only considered during a `search` of `name` or
`name-desc`.

Note: This change was written as a solution to
https://bugs.archlinux.org/task/49133.

PS: + Some spacing issues fixed in comments.

Signed-off-by: Kevin Morris <kevr.gtalk@gmail.com>
---
 doc/rpc.txt               |  4 +++
 web/lib/aurjson.class.php | 60 +++++++++++++++++++++++++++++++++------
 2 files changed, 56 insertions(+), 8 deletions(-)

Comments

Lukas Fleischer July 5, 2020, 12:30 a.m. UTC | #1
On Sat, 04 Jul 2020 at 17:14:26, Kevin Morris wrote:
> API Version 6 modifies `type=search` functionality: Split `arg` into
> keywords separated by spaces (quoted keywords are preserved).
> [...]
> Signed-off-by: Kevin Morris <kevr.gtalk@gmail.com>
> ---
>  doc/rpc.txt               |  4 +++
>  web/lib/aurjson.class.php | 60 +++++++++++++++++++++++++++++++++------
>  2 files changed, 56 insertions(+), 8 deletions(-)

Thanks for the quick revision. Looks great now, I merged this as

    Support conjunctive keyword search in RPC interface

to pu (I think it is better to refer to the new feature in the subject
line and put aside "Bump RPC API Version" for patches that just bump the
API version in case it was forgotten in a previous patch).

Now that this is implemented, I just wanted to suggest implementing the
same functionality for the web interface until I noticed that we already
do and it's even more powerful: we additionally support "OR" and "NOT"
there; see construct_keyword_search() in pkgfuncs.inc.php. Would you be
open to working on another revision of this patch that refactors
construct_keyword_search() a little bit (to make it usable for the RPC
interface) and then uses that for API version 6 instead?
Kevin Morris July 5, 2020, 12:52 a.m. UTC | #2
Hey Lukas,

Awesome! I do like the sound better of using the same exact search sql for
both RPC and Web, and so your suggestion seems like a great idea to me.
I'll start looking at this today!

On Sat, Jul 4, 2020 at 5:30 PM Lukas Fleischer <lfleischer@archlinux.org>
wrote:

> On Sat, 04 Jul 2020 at 17:14:26, Kevin Morris wrote:
> > API Version 6 modifies `type=search` functionality: Split `arg` into
> > keywords separated by spaces (quoted keywords are preserved).
> > [...]
> > Signed-off-by: Kevin Morris <kevr.gtalk@gmail.com>
> > ---
> >  doc/rpc.txt               |  4 +++
> >  web/lib/aurjson.class.php | 60 +++++++++++++++++++++++++++++++++------
> >  2 files changed, 56 insertions(+), 8 deletions(-)
>
> Thanks for the quick revision. Looks great now, I merged this as
>
>     Support conjunctive keyword search in RPC interface
>
> to pu (I think it is better to refer to the new feature in the subject
> line and put aside "Bump RPC API Version" for patches that just bump the
> API version in case it was forgotten in a previous patch).
>
> Now that this is implemented, I just wanted to suggest implementing the
> same functionality for the web interface until I noticed that we already
> do and it's even more powerful: we additionally support "OR" and "NOT"
> there; see construct_keyword_search() in pkgfuncs.inc.php. Would you be
> open to working on another revision of this patch that refactors
> construct_keyword_search() a little bit (to make it usable for the RPC
> interface) and then uses that for API version 6 instead?
>
Lukas Fleischer July 5, 2020, 1:16 a.m. UTC | #3
On Sat, 04 Jul 2020 at 20:52:09, Kevin Morris wrote:
> Awesome! I do like the sound better of using the same exact search sql for
> both RPC and Web, and so your suggestion seems like a great idea to me.
> I'll start looking at this today!

Sounds great, I am looking forward to a new patch. And sorry for not
thinking this through more carefully earlier!

Patch

diff --git a/doc/rpc.txt b/doc/rpc.txt
index 3148ebea..b0f5c4e1 100644
--- a/doc/rpc.txt
+++ b/doc/rpc.txt
@@ -39,6 +39,10 @@  Examples
   `/rpc/?v=5&type=search&by=makedepends&arg=boost`
 `search` with callback::
   `/rpc/?v=5&type=search&arg=foobar&callback=jsonp1192244621103`
+`search` with API Version 6 for packages containing `cookie` AND `milk`::
+  `/rpc/?v=6&type=search&arg=cookie%20milk`
+`search` with API Version 6 for packages containing `cookie milk`::
+  `/rpc/?v=6&type=search&arg="cookie milk"`
 `info`::
   `/rpc/?v=5&type=info&arg[]=foobar`
 `info` with multiple packages::
diff --git a/web/lib/aurjson.class.php b/web/lib/aurjson.class.php
index 0ac586fe..b639653f 100644
--- a/web/lib/aurjson.class.php
+++ b/web/lib/aurjson.class.php
@@ -80,7 +80,7 @@  class AurJSON {
 		if (isset($http_data['v'])) {
 			$this->version = intval($http_data['v']);
 		}
-		if ($this->version < 1 || $this->version > 5) {
+		if ($this->version < 1 || $this->version > 6) {
 			return $this->json_error('Invalid version specified.');
 		}
 
@@ -140,7 +140,7 @@  class AurJSON {
 	}
 
 	/*
-	 * Check if an IP needs to be  rate limited.
+	 * Check if an IP needs to be rate limited.
 	 *
 	 * @param $ip IP of the current request
 	 *
@@ -192,7 +192,7 @@  class AurJSON {
 		$value = get_cache_value('ratelimit-ws:' . $ip, $status);
 		if (!$status || ($status && $value < $deletion_time)) {
 			if (set_cache_value('ratelimit-ws:' . $ip, $time, $window_length) &&
-			    set_cache_value('ratelimit:' . $ip, 1, $window_length)) {
+				set_cache_value('ratelimit:' . $ip, 1, $window_length)) {
 				return;
 			}
 		} else {
@@ -370,7 +370,7 @@  class AurJSON {
 		} elseif ($this->version >= 2) {
 			if ($this->version == 2 || $this->version == 3) {
 				$fields = implode(',', self::$fields_v2);
-			} else if ($this->version == 4 || $this->version == 5) {
+			} else if ($this->version >= 4 && $this->version <= 6) {
 				$fields = implode(',', self::$fields_v4);
 			}
 			$query = "SELECT {$fields} " .
@@ -472,6 +472,27 @@  class AurJSON {
 		return array('ids' => $id_args, 'names' => $name_args);
 	}
 
+	/*
+	 * Prepare a WHERE statement for each $transform($keyword), separated
+	 * by $delim.
+	 *
+	 * @param $delim Delimiter used to join several keywords.
+	 * @param $keywords Array of keywords.
+	 * @param $transform Transformation function that receives a keyword
+	 *				and returns a transformed string.
+	 *
+	 * @return string WHERE condition statement of transformed keywords
+	 *				 separated by $delim.
+	 */
+	private function join_where($delim, $keywords, $transform) {
+		$reduce_func = function($carry, $item) use(&$transform) {
+			array_push($carry, $transform($item));
+			return $carry;
+		};
+		$result = array_reduce($keywords, $reduce_func, array());
+		return join($delim, $result);
+	}
+
 	/*
 	 * Performs a fulltext mysql search of the package database.
 	 *
@@ -492,13 +513,36 @@  class AurJSON {
 			if (strlen($keyword_string) < 2) {
 				return $this->json_error('Query arg too small.');
 			}
-			$keyword_string = $this->dbh->quote("%" . addcslashes($keyword_string, '%_') . "%");
 
 			if ($search_by === 'name') {
-				$where_condition = "(Packages.Name LIKE $keyword_string)";
+				if ($this->version >= 6) {
+					$keywords = preg_split('/"([^"]*)"|\h+/', $keyword_string, -1, 
+						PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE);
+					$func = function($keyword) {
+						$str = $this->dbh->quote("%" . addcslashes($keyword, '%_') . "%");
+						return "(Packages.Name LIKE $str)";
+					};
+					$where_condition = $this->join_where(" AND ", $keywords, $func);
+				} else {
+					$keyword_string = $this->dbh->quote(
+						"%" . addcslashes($keyword_string, '%_') . "%");
+					$where_condition = "(Packages.Name LIKE $keyword_string)";
+				}
 			} else if ($search_by === 'name-desc') {
-				$where_condition = "(Packages.Name LIKE $keyword_string OR ";
-				$where_condition .= "Description LIKE $keyword_string)";
+				if ($this->version >= 6) {
+					$keywords = preg_split('/"([^"]*)"|\h+/', $keyword_string, -1, 
+						PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE);
+					$func = function($keyword) {
+						$str = $this->dbh->quote("%" . addcslashes($keyword, '%_') . "%");
+						return "(Packages.Name LIKE $str OR Description LIKE $str)";
+					};
+					$where_condition = $this->join_where(" AND ", $keywords, $func);
+				} else {
+					$keyword_string = $this->dbh->quote(
+						"%" . addcslashes($keyword_string, '%_') . "%");
+					$where_condition = "(Packages.Name LIKE $keyword_string ";
+					$where_condition .= "OR Description LIKE $keyword_string)";
+				}
 			}
 		} else if ($search_by === 'maintainer') {
 			if (empty($keyword_string)) {