From patchwork Mon Jul 6 01:19:06 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Morris X-Patchwork-Id: 1710 Return-Path: Delivered-To: patchwork@archlinux.org Received: from apollo.archlinux.org (localhost [127.0.0.1]) by apollo.archlinux.org (Postfix) with ESMTP id 645E519C1A798 for ; Mon, 6 Jul 2020 01:19:40 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on apollo.archlinux.org X-Spam-Level: X-Spam-Status: No, score=-1.7 required=5.0 tests=DKIM_ADSP_CUSTOM_MED=0.001, DKIM_INVALID=1,DKIM_SIGNED=0.1,FREEMAIL_FROM=0.5,MAILING_LIST_MULTI=-1, RCVD_IN_DNSWL_MED=-2.3,RCVD_IN_MSPIKE_H4=0.001,RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001,T_DMARC_POLICY_NONE=0.01,T_DMARC_SIMPLE_DKIM=0.01 autolearn=ham autolearn_force=no version=3.4.4 X-Spam-BL-Results: [127.0.9.2] [127.0.0.19] Received: from orion.archlinux.org (orion.archlinux.org [88.198.91.70]) by apollo.archlinux.org (Postfix) with ESMTPS for ; Mon, 6 Jul 2020 01:19:40 +0000 (UTC) Received: from orion.archlinux.org (localhost [127.0.0.1]) by orion.archlinux.org (Postfix) with ESMTP id 52FEB1D358A5B6; Mon, 6 Jul 2020 01:19:33 +0000 (UTC) Received: from luna.archlinux.org (luna.archlinux.org [IPv6:2a01:4f8:160:3033::2]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-384) server-signature RSA-PSS (4096 bits)) (No client certificate requested) (Authenticated sender: luna) by orion.archlinux.org (Postfix) with ESMTPSA id 37C7D1D358A5B3; Mon, 6 Jul 2020 01:19:33 +0000 (UTC) Authentication-Results: orion.archlinux.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=vMfz9SGz Received: from luna.archlinux.org (luna.archlinux.org [127.0.0.1]) by luna.archlinux.org (Postfix) with ESMTP id 2BA0029CB7; Mon, 6 Jul 2020 01:19:33 +0000 (UTC) Authentication-Results: luna.archlinux.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=vMfz9SGz Received: from luna.archlinux.org (luna.archlinux.org [127.0.0.1]) by luna.archlinux.org (Postfix) with ESMTP id 4897029CB4 for ; Mon, 6 Jul 2020 01:19:28 +0000 (UTC) Received: from orion.archlinux.org (orion.archlinux.org [88.198.91.70]) by luna.archlinux.org (Postfix) with ESMTPS for ; Mon, 6 Jul 2020 01:19:28 +0000 (UTC) Received: from orion.archlinux.org (localhost [127.0.0.1]) by orion.archlinux.org (Postfix) with ESMTP id 2BF0E1D358A56E for ; Mon, 6 Jul 2020 01:19:20 +0000 (UTC) Received: from mail-pj1-x1041.google.com (mail-pj1-x1041.google.com [IPv6:2607:f8b0:4864:20::1041]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-384) server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by orion.archlinux.org (Postfix) with ESMTPS for ; Mon, 6 Jul 2020 01:19:20 +0000 (UTC) Received: by mail-pj1-x1041.google.com with SMTP id t15so1877649pjq.5 for ; Sun, 05 Jul 2020 18:19:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Ra/LwyUVKwa3VyuXdvSKdXKgVIr/HRffz5unzJ8/UTg=; b=vMfz9SGznx1L/4Z33RnIjSkHiEgJ8dThmydgQFnMjdoIuO2QOaZHMDlL+HLgEAKvvc B3lf+eYJqtJDUmzhkXl2VkrV2uzGv+UR3qlyFo69YTMRQqzRx1Pd7Q/GInLmRDtYBd5f 4Xk5Rk0CJBSuuU4D9jkCclymSZD05HQ8wOS3kA6qiFwGwQnjvRZzHwE+uqRECpz+x2p8 EiTRNY5NOUJyzZBblSIDffVRQfauS+O6JQFml++37FufcEo14QkAb26sIdxAS3OSkK4S NEwVDJxOaR4YXS+RwSd+XpWj9PbIGbcC/jtrU4DBEWkzUWjKym0T2wotfVkQfjMnjRu+ Goxg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Ra/LwyUVKwa3VyuXdvSKdXKgVIr/HRffz5unzJ8/UTg=; b=TfmyH30qfUOgp1q9fVM5LRDLmVHM4nOhe/f6/ZtFilZS9ida5YVbHGZ33J8J/ByIcs 8O+YslNYXYO64UAF0f4l0zlmeuTtRiyUR1hgbZA9ROwOyKL9tWzHUxUVAKhNlc88WWtn a5SA9iLna7Yd805IAhaj/GCFNHaaemnNJm43Nk4+s16oE7l/z95Dw3HWMjvDGmWs1v/J +1xUSm0UHlAUJntL1v5AAdTBOOod24VXjoDDrDsjf4QtpBTlqnOFYq5hz/L8AJ0e9vkX ydHZwEDxXQ1kSroX1OwYzQ03cTnk7W6hmdUCtEbE/rlg3q0nQasR42Mb/+mCBU8LMebG OX3Q== X-Gm-Message-State: AOAM531VA0Wm/g7uGVUZxkXWcwXimRMB6V4WkfUOo+x++de4NekcfN32 jEiUNMI0GBIH34xm50K/TvYnMZXkpZw= X-Google-Smtp-Source: ABdhPJzqc/XsP7olsK5T8ZnJl9kKOmFaCIl0L204WoSrkuVUZ4Qe7NW2dfZW3V119SahR7Wy/6ZEpQ== X-Received: by 2002:a17:902:be0e:: with SMTP id r14mr13675422pls.309.1593998357758; Sun, 05 Jul 2020 18:19:17 -0700 (PDT) Received: from localhost.localdomain ([2600:1:9a05:3a11:14d2:c57b:c90c:fa96]) by smtp.gmail.com with ESMTPSA id x66sm16043149pgb.12.2020.07.05.18.19.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Jul 2020 18:19:17 -0700 (PDT) From: Kevin Morris To: aur-dev@archlinux.org Subject: [PATCH v3] Support conjunctive keyword search in RPC interface Date: Sun, 5 Jul 2020 18:19:06 -0700 Message-Id: <20200706011906.7214-1-kevr.gtalk@gmail.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: References: MIME-Version: 1.0 X-BeenThere: aur-dev@archlinux.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: "Arch User Repository \(AUR\) Development" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: aur-dev-bounces@archlinux.org Sender: "aur-dev" Newly supported API Version 6 modifies `type=search` for _by_ type `name-desc`: it now behaves the same as `name-desc` search through the https://aur.archlinux.org/packages/ search page. Search for packages containing the literal keyword `blah blah` AND `haha`: https://aur.archlinux.org/rpc/?v=6&type=search&arg="blah blah"%20haha Search for packages containing the literal keyword `abc 123`: https://aur.archlinux.org/rpc/?v=6&type=search&arg="abc 123" The following example searches for packages that contain `blah` AND `abc`: https://aur.archlinux.org/rpc/?v=6&type=search&arg=blah%20abc The legacy method still searches for packages that contain `blah abc`: https://aur.archlinux.org/rpc/?v=5&type=search&arg=blah%20abc https://aur.archlinux.org/rpc/?v=5&type=search&arg=blah%20abc API Version 6 is currently only considered during a `search` of `name-desc`. Note: This change was written as a solution to https://bugs.archlinux.org/task/49133. PS: + Some spacing issues fixed in comments. Signed-off-by: Kevin Morris --- doc/rpc.txt | 4 ++++ web/lib/aurjson.class.php | 33 +++++++++++++++++++++++---------- web/lib/pkgfuncs.inc.php | 36 +++++++++++++++++++++++------------- 3 files changed, 50 insertions(+), 23 deletions(-) diff --git a/doc/rpc.txt b/doc/rpc.txt index 3148ebea..b0f5c4e1 100644 --- a/doc/rpc.txt +++ b/doc/rpc.txt @@ -39,6 +39,10 @@ Examples `/rpc/?v=5&type=search&by=makedepends&arg=boost` `search` with callback:: `/rpc/?v=5&type=search&arg=foobar&callback=jsonp1192244621103` +`search` with API Version 6 for packages containing `cookie` AND `milk`:: + `/rpc/?v=6&type=search&arg=cookie%20milk` +`search` with API Version 6 for packages containing `cookie milk`:: + `/rpc/?v=6&type=search&arg="cookie milk"` `info`:: `/rpc/?v=5&type=info&arg[]=foobar` `info` with multiple packages:: diff --git a/web/lib/aurjson.class.php b/web/lib/aurjson.class.php index 0ac586fe..b92e9094 100644 --- a/web/lib/aurjson.class.php +++ b/web/lib/aurjson.class.php @@ -1,6 +1,7 @@ version = intval($http_data['v']); } - if ($this->version < 1 || $this->version > 5) { + if ($this->version < 1 || $this->version > 6) { return $this->json_error('Invalid version specified.'); } @@ -140,7 +141,7 @@ class AurJSON { } /* - * Check if an IP needs to be rate limited. + * Check if an IP needs to be rate limited. * * @param $ip IP of the current request * @@ -192,7 +193,7 @@ class AurJSON { $value = get_cache_value('ratelimit-ws:' . $ip, $status); if (!$status || ($status && $value < $deletion_time)) { if (set_cache_value('ratelimit-ws:' . $ip, $time, $window_length) && - set_cache_value('ratelimit:' . $ip, 1, $window_length)) { + set_cache_value('ratelimit:' . $ip, 1, $window_length)) { return; } } else { @@ -370,7 +371,7 @@ class AurJSON { } elseif ($this->version >= 2) { if ($this->version == 2 || $this->version == 3) { $fields = implode(',', self::$fields_v2); - } else if ($this->version == 4 || $this->version == 5) { + } else if ($this->version >= 4 && $this->version <= 6) { $fields = implode(',', self::$fields_v4); } $query = "SELECT {$fields} " . @@ -492,13 +493,25 @@ class AurJSON { if (strlen($keyword_string) < 2) { return $this->json_error('Query arg too small.'); } - $keyword_string = $this->dbh->quote("%" . addcslashes($keyword_string, '%_') . "%"); - if ($search_by === 'name') { - $where_condition = "(Packages.Name LIKE $keyword_string)"; - } else if ($search_by === 'name-desc') { - $where_condition = "(Packages.Name LIKE $keyword_string OR "; - $where_condition .= "Description LIKE $keyword_string)"; + if ($this->version >= 6 && $search_by === 'name-desc') { + + // API Version 6 with _by_ `name-desc`, create a conjunctive query. + $where_condition = construct_keyword_search($this->dbh, + $keyword_string, true, false); + + } else { + + $keyword_string = $this->dbh->quote( + "%" . addcslashes($keyword_string, '%_') . "%"); + + if ($search_by === 'name') { + $where_condition = "(Packages.Name LIKE $keyword_string)"; + } else if ($search_by === 'name-desc') { + $where_condition = "(Packages.Name LIKE $keyword_string "; + $where_condition .= "OR Description LIKE $keyword_string)"; + } + } } else if ($search_by === 'maintainer') { if (empty($keyword_string)) { diff --git a/web/lib/pkgfuncs.inc.php b/web/lib/pkgfuncs.inc.php index 8c915711..83d49b0f 100644 --- a/web/lib/pkgfuncs.inc.php +++ b/web/lib/pkgfuncs.inc.php @@ -696,8 +696,10 @@ function pkg_search_page($params, $show_headers=true, $SID="") { $q_where .= "AND (PackageBases.Name LIKE " . $dbh->quote($K) . ") "; } elseif (isset($params["SeB"]) && $params["SeB"] == "k") { - /* Search by keywords. */ - $q_where .= construct_keyword_search($dbh, $params['K'], false); + /* Search by name. */ + $q_where .= "AND ("; + $q_where .= construct_keyword_search($dbh, $params['K'], false, true); + $q_where .= ") "; } elseif (isset($params["SeB"]) && $params["SeB"] == "N") { /* Search by name (exact match). */ @@ -709,7 +711,9 @@ function pkg_search_page($params, $show_headers=true, $SID="") { } else { /* Keyword search (default). */ - $q_where .= construct_keyword_search($dbh, $params['K'], true); + $q_where .= "AND ("; + $q_where .= construct_keyword_search($dbh, $params['K'], true, true); + $q_where .= ") "; } } @@ -833,10 +837,11 @@ function pkg_search_page($params, $show_headers=true, $SID="") { * @param handle $dbh Database handle * @param string $keywords The search term * @param bool $namedesc Search name and description fields + * @param bool $keyword Search packages with a matching PackageBases.Keyword * * @return string WHERE part of the SQL clause */ -function construct_keyword_search($dbh, $keywords, $namedesc) { +function construct_keyword_search($dbh, $keywords, $namedesc, $keyword=false) { $count = 0; $where_part = ""; $q_keywords = ""; @@ -863,11 +868,20 @@ function construct_keyword_search($dbh, $keywords, $namedesc) { $q_keywords .= $op . " ("; if ($namedesc) { $q_keywords .= "Packages.Name LIKE " . $dbh->quote($term) . " OR "; - $q_keywords .= "Description LIKE " . $dbh->quote($term) . " OR "; + $q_keywords .= "Description LIKE " . $dbh->quote($term) . " "; + } + else { + $q_keywords .= "Packages.Name LIKE " . $dbh->quote($term) . " "; + } + + if ($keyword) { + $q_keywords .= "OR EXISTS (SELECT * FROM PackageKeywords WHERE "; + $q_keywords .= "PackageKeywords.PackageBaseID = Packages.PackageBaseID AND "; + $q_keywords .= "PackageKeywords.Keyword LIKE " . $dbh->quote($term) . ")) "; + } + else { + $q_keywords .= ") "; } - $q_keywords .= "EXISTS (SELECT * FROM PackageKeywords WHERE "; - $q_keywords .= "PackageKeywords.PackageBaseID = Packages.PackageBaseID AND "; - $q_keywords .= "PackageKeywords.Keyword LIKE " . $dbh->quote($term) . ")) "; $count++; if ($count >= 20) { @@ -876,11 +890,7 @@ function construct_keyword_search($dbh, $keywords, $namedesc) { $op = "AND "; } - if (!empty($q_keywords)) { - $where_part = "AND (" . $q_keywords . ") "; - } - - return $where_part; + return $q_keywords; } /**