From patchwork Mon Jul 6 00:57:26 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Morris X-Patchwork-Id: 1709 Return-Path: Delivered-To: patchwork@archlinux.org Received: from apollo.archlinux.org (localhost [127.0.0.1]) by apollo.archlinux.org (Postfix) with ESMTP id 6D30819C18A7E for ; Mon, 6 Jul 2020 00:57:57 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on apollo.archlinux.org X-Spam-Level: X-Spam-Status: No, score=-1.7 required=5.0 tests=DKIM_ADSP_CUSTOM_MED=0.001, DKIM_INVALID=1,DKIM_SIGNED=0.1,FREEMAIL_FROM=0.5,MAILING_LIST_MULTI=-1, RCVD_IN_DNSWL_MED=-2.3,RCVD_IN_MSPIKE_H4=0.001,RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001,T_DMARC_POLICY_NONE=0.01,T_DMARC_SIMPLE_DKIM=0.01 autolearn=ham autolearn_force=no version=3.4.4 X-Spam-BL-Results: [127.0.0.19] [127.0.9.2] Received: from orion.archlinux.org (orion.archlinux.org [88.198.91.70]) by apollo.archlinux.org (Postfix) with ESMTPS for ; Mon, 6 Jul 2020 00:57:57 +0000 (UTC) Received: from orion.archlinux.org (localhost [127.0.0.1]) by orion.archlinux.org (Postfix) with ESMTP id 997B41D358A4B6; Mon, 6 Jul 2020 00:57:50 +0000 (UTC) Received: from luna.archlinux.org (luna.archlinux.org [IPv6:2a01:4f8:160:3033::2]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-384) server-signature RSA-PSS (4096 bits)) (No client certificate requested) (Authenticated sender: luna) by orion.archlinux.org (Postfix) with ESMTPSA id 694861D358A4B0; Mon, 6 Jul 2020 00:57:50 +0000 (UTC) Authentication-Results: orion.archlinux.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=inSsm8N5 Received: from luna.archlinux.org (luna.archlinux.org [127.0.0.1]) by luna.archlinux.org (Postfix) with ESMTP id 54DC229CB6; Mon, 6 Jul 2020 00:57:50 +0000 (UTC) Authentication-Results: luna.archlinux.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=inSsm8N5 Received: from luna.archlinux.org (luna.archlinux.org [127.0.0.1]) by luna.archlinux.org (Postfix) with ESMTP id B21FB29CB4 for ; Mon, 6 Jul 2020 00:57:46 +0000 (UTC) Received: from orion.archlinux.org (orion.archlinux.org [IPv6:2a01:4f8:160:6087::1]) by luna.archlinux.org (Postfix) with ESMTPS for ; Mon, 6 Jul 2020 00:57:46 +0000 (UTC) Received: from orion.archlinux.org (localhost [127.0.0.1]) by orion.archlinux.org (Postfix) with ESMTP id A84FC1D358A4AA for ; Mon, 6 Jul 2020 00:57:41 +0000 (UTC) Received: from mail-pj1-x1042.google.com (mail-pj1-x1042.google.com [IPv6:2607:f8b0:4864:20::1042]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-384) server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by orion.archlinux.org (Postfix) with ESMTPS for ; Mon, 6 Jul 2020 00:57:41 +0000 (UTC) Received: by mail-pj1-x1042.google.com with SMTP id mn17so2752137pjb.4 for ; Sun, 05 Jul 2020 17:57:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=lwN58cRUUGlK25khamS+Bx1KNWbT9msn94UKAHXig5A=; b=inSsm8N5WX70y67mCCS1J3UkwxlSevBEM3ycOKnNo3R1K3uKn5ltbzoPpyFF+2kxxC hOqYgfdgDLpsPqbl2Z0ML1A/WXPSXIyFaAgGu6U6HGEJ8xUmOeju9atKYjuskcZf/4vI CDrY84Jv854w8zySPZ7agTSbPl0Uaa6vFnaBIbcg8IR0eG910WbqZ02grijyojropAXp SD75Kgr2sehUhsVbs+iJurwYPWjn3CJs2GTAMghKlfO/d3gVXd/dZIIP0YCOsuku4Lnx DA8cr+HNg6GS24fAdlXbEWDIQP/yi8iYkMKObP3oMIjKasY9mzkOJIiQAsBJTKJxiEIU CLMg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=lwN58cRUUGlK25khamS+Bx1KNWbT9msn94UKAHXig5A=; b=uXZi7MlYPzMfcsgqWofarDyzUUdGlQebvMJM1a1MmQ9R9N4qWMWh0WZV0eB8TJqN8b q0u5D0GpTV2tkKB2RJkoErnYVkAYUBMwuOCkE7N1kNAD/xlZIcAo2rmyBm6/syGi6+6H VtZgJb/fQ/T8LtjAwpy0cbYJx6oRTCihh4XgR3wRf09+Hj259S2eWx/s+hpFPxBtJL81 vPw3kWGoNu+a8WMqGXYqypkK4j0yrtZcwJjFm09lcYaxv6Fj48pYGwtSUj/L2xHSOjox vSSumJng6BBESq72DuXXlthp6Xsp4QFEfcTXH9NrI/pkngxcPIdsW4tLV24SEP6k2Nv7 1hvA== X-Gm-Message-State: AOAM530nP9zYIcrJPf9fonOYzy2qtNEBmP/EVLGMZERTuhjudCeZZYKU fOzxEc1B6nyDCXQHDrHlCJf96MsUWRM= X-Google-Smtp-Source: ABdhPJxjuF61y1LuJQmoGoH3K95FEUTQi7WV1NYt0C8VoEqFeJri6D0et9LScvXn5xBND9xxFF91Sw== X-Received: by 2002:a17:90a:1f87:: with SMTP id x7mr18921580pja.101.1593997059099; Sun, 05 Jul 2020 17:57:39 -0700 (PDT) Received: from localhost.localdomain ([2600:1:9a05:3a11:14d2:c57b:c90c:fa96]) by smtp.gmail.com with ESMTPSA id q11sm10282092pgr.69.2020.07.05.17.57.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Jul 2020 17:57:38 -0700 (PDT) From: Kevin Morris To: aur-dev@archlinux.org Subject: [PATCH v2] Support conjunctive keyword search in RPC interface Date: Sun, 5 Jul 2020 17:57:26 -0700 Message-Id: <20200706005726.24915-1-kevr.gtalk@gmail.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: References: MIME-Version: 1.0 X-BeenThere: aur-dev@archlinux.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: "Arch User Repository \(AUR\) Development" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: aur-dev-bounces@archlinux.org Sender: "aur-dev" Newly supported API Version 6 modifies `type=search` functionality; it now behaves the same as `name` or `name-desc` search through the https://aur.archlinux.org/packages/ search page. Search for packages containing the literal keyword `blah blah` AND `haha`: https://aur.archlinux.org/rpc/?v=6&type=search&arg="blah blah"%20haha Search for packages containing the literal keyword `abc 123`: https://aur.archlinux.org/rpc/?v=6&type=search&arg="abc 123" The following example searches for packages that contain `blah` AND `abc`: https://aur.archlinux.org/rpc/?v=6&type=search&arg=blah%20abc The legacy method still searches for packages that contain `blah abc`: https://aur.archlinux.org/rpc/?v=5&type=search&arg=blah%20abc https://aur.archlinux.org/rpc/?v=5&type=search&arg=blah%20abc API Version 6 is currently only considered during a `search` of `name` or `name-desc`. Additionally, this change adds support for conjuctive search when searching by `name` in https://aur.archlinux.org/packages/. Note: This change was written as a solution to https://bugs.archlinux.org/task/49133. PS: + Some spacing issues fixed in comments. Signed-off-by: Kevin Morris --- doc/rpc.txt | 4 ++++ web/lib/aurjson.class.php | 34 ++++++++++++++++++++++---------- web/lib/pkgfuncs.inc.php | 41 +++++++++++++++++++++++++-------------- 3 files changed, 54 insertions(+), 25 deletions(-) diff --git a/doc/rpc.txt b/doc/rpc.txt index 3148ebea..b0f5c4e1 100644 --- a/doc/rpc.txt +++ b/doc/rpc.txt @@ -39,6 +39,10 @@ Examples `/rpc/?v=5&type=search&by=makedepends&arg=boost` `search` with callback:: `/rpc/?v=5&type=search&arg=foobar&callback=jsonp1192244621103` +`search` with API Version 6 for packages containing `cookie` AND `milk`:: + `/rpc/?v=6&type=search&arg=cookie%20milk` +`search` with API Version 6 for packages containing `cookie milk`:: + `/rpc/?v=6&type=search&arg="cookie milk"` `info`:: `/rpc/?v=5&type=info&arg[]=foobar` `info` with multiple packages:: diff --git a/web/lib/aurjson.class.php b/web/lib/aurjson.class.php index 0ac586fe..da1af9be 100644 --- a/web/lib/aurjson.class.php +++ b/web/lib/aurjson.class.php @@ -80,7 +80,7 @@ class AurJSON { if (isset($http_data['v'])) { $this->version = intval($http_data['v']); } - if ($this->version < 1 || $this->version > 5) { + if ($this->version < 1 || $this->version > 6) { return $this->json_error('Invalid version specified.'); } @@ -140,7 +140,7 @@ class AurJSON { } /* - * Check if an IP needs to be rate limited. + * Check if an IP needs to be rate limited. * * @param $ip IP of the current request * @@ -192,7 +192,7 @@ class AurJSON { $value = get_cache_value('ratelimit-ws:' . $ip, $status); if (!$status || ($status && $value < $deletion_time)) { if (set_cache_value('ratelimit-ws:' . $ip, $time, $window_length) && - set_cache_value('ratelimit:' . $ip, 1, $window_length)) { + set_cache_value('ratelimit:' . $ip, 1, $window_length)) { return; } } else { @@ -370,7 +370,7 @@ class AurJSON { } elseif ($this->version >= 2) { if ($this->version == 2 || $this->version == 3) { $fields = implode(',', self::$fields_v2); - } else if ($this->version == 4 || $this->version == 5) { + } else if ($this->version >= 4 && $this->version <= 6) { $fields = implode(',', self::$fields_v4); } $query = "SELECT {$fields} " . @@ -492,13 +492,27 @@ class AurJSON { if (strlen($keyword_string) < 2) { return $this->json_error('Query arg too small.'); } - $keyword_string = $this->dbh->quote("%" . addcslashes($keyword_string, '%_') . "%"); - if ($search_by === 'name') { - $where_condition = "(Packages.Name LIKE $keyword_string)"; - } else if ($search_by === 'name-desc') { - $where_condition = "(Packages.Name LIKE $keyword_string OR "; - $where_condition .= "Description LIKE $keyword_string)"; + if ($this->version >= 6) { + + // API Version 6. + $namedesc = $search_by === 'name-desc'; + $where_condition = construct_keyword_search($this->dbh, + $keyword_string, $namedesc, false); + + } else { + + // API Version 5 and below. + $keyword_string = $this->dbh->quote( + "%" . addcslashes($keyword_string, '%_') . "%"); + + if ($search_by === 'name') { + $where_condition = "(Packages.Name LIKE $keyword_string)"; + } else if ($search_by === 'name-desc') { + $where_condition = "(Packages.Name LIKE $keyword_string "; + $where_condition .= "OR Description LIKE $keyword_string)"; + } + } } else if ($search_by === 'maintainer') { if (empty($keyword_string)) { diff --git a/web/lib/pkgfuncs.inc.php b/web/lib/pkgfuncs.inc.php index 8c915711..e56fef77 100644 --- a/web/lib/pkgfuncs.inc.php +++ b/web/lib/pkgfuncs.inc.php @@ -687,8 +687,9 @@ function pkg_search_page($params, $show_headers=true, $SID="") { } elseif (isset($params["SeB"]) && $params["SeB"] == "n") { /* Search by name. */ - $K = "%" . addcslashes($params['K'], '%_') . "%"; - $q_where .= "AND (Packages.Name LIKE " . $dbh->quote($K) . ") "; + $q_where .= "AND ("; + $q_where .= construct_keyword_search($dbh, $params['K'], false, false); + $q_where .= ") "; } elseif (isset($params["SeB"]) && $params["SeB"] == "b") { /* Search by package base name. */ @@ -696,8 +697,10 @@ function pkg_search_page($params, $show_headers=true, $SID="") { $q_where .= "AND (PackageBases.Name LIKE " . $dbh->quote($K) . ") "; } elseif (isset($params["SeB"]) && $params["SeB"] == "k") { - /* Search by keywords. */ - $q_where .= construct_keyword_search($dbh, $params['K'], false); + /* Search by name. */ + $q_where .= "AND ("; + $q_where .= construct_keyword_search($dbh, $params['K'], false, true); + $q_where .= ") "; } elseif (isset($params["SeB"]) && $params["SeB"] == "N") { /* Search by name (exact match). */ @@ -709,7 +712,9 @@ function pkg_search_page($params, $show_headers=true, $SID="") { } else { /* Keyword search (default). */ - $q_where .= construct_keyword_search($dbh, $params['K'], true); + $q_where .= "AND ("; + $q_where .= construct_keyword_search($dbh, $params['K'], true, true); + $q_where .= ") "; } } @@ -833,10 +838,11 @@ function pkg_search_page($params, $show_headers=true, $SID="") { * @param handle $dbh Database handle * @param string $keywords The search term * @param bool $namedesc Search name and description fields + * @param bool $keyword Search packages with a matching PackageBases.Keyword * * @return string WHERE part of the SQL clause */ -function construct_keyword_search($dbh, $keywords, $namedesc) { +function construct_keyword_search($dbh, $keywords, $namedesc, $keyword=false) { $count = 0; $where_part = ""; $q_keywords = ""; @@ -863,11 +869,20 @@ function construct_keyword_search($dbh, $keywords, $namedesc) { $q_keywords .= $op . " ("; if ($namedesc) { $q_keywords .= "Packages.Name LIKE " . $dbh->quote($term) . " OR "; - $q_keywords .= "Description LIKE " . $dbh->quote($term) . " OR "; + $q_keywords .= "Description LIKE " . $dbh->quote($term) . " "; + } + else { + $q_keywords .= "Packages.Name LIKE " . $dbh->quote($term) . " "; + } + + if ($keyword) { + $q_keywords .= "OR EXISTS (SELECT * FROM PackageKeywords WHERE "; + $q_keywords .= "PackageKeywords.PackageBaseID = Packages.PackageBaseID AND "; + $q_keywords .= "PackageKeywords.Keyword LIKE " . $dbh->quote($term) . ")) "; + } + else { + $q_keywords .= ") "; } - $q_keywords .= "EXISTS (SELECT * FROM PackageKeywords WHERE "; - $q_keywords .= "PackageKeywords.PackageBaseID = Packages.PackageBaseID AND "; - $q_keywords .= "PackageKeywords.Keyword LIKE " . $dbh->quote($term) . ")) "; $count++; if ($count >= 20) { @@ -876,11 +891,7 @@ function construct_keyword_search($dbh, $keywords, $namedesc) { $op = "AND "; } - if (!empty($q_keywords)) { - $where_part = "AND (" . $q_keywords . ") "; - } - - return $where_part; + return $q_keywords; } /**