diff mbox

feat(rpc): return "providers" packages when querying by `name` or `name-desc`

Message ID 20180320044957.26489-1-morganamilo@gmail.com
State New
Headers show

Commit Message

morganamilo March 20, 2018, 4:49 a.m. UTC
From: actionless <actionless.loveless@gmail.com>

---
 doc/rpc.txt               |  4 ++--
 web/lib/aurjson.class.php | 16 +++++++++++++++-
 2 files changed, 17 insertions(+), 3 deletions(-)

Comments

morganamilo March 20, 2018, 4:59 a.m. UTC | #1
On 20 March 2018 at 04:49, morganamilo <morganamilo@gmail.com> wrote:
> From: actionless <actionless.loveless@gmail.com>
>
> ---
>  doc/rpc.txt               |  4 ++--
>  web/lib/aurjson.class.php | 16 +++++++++++++++-
>  2 files changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/doc/rpc.txt b/doc/rpc.txt
> index f353ff0..83cdae3 100644
> --- a/doc/rpc.txt
> +++ b/doc/rpc.txt
> @@ -8,8 +8,8 @@ Package searches can be performed by issuing HTTP GET requests of the form
>  +/rpc/?v=5&type=search&by=_field_&arg=_keywords_+ where _keywords_ is the
>  search argument and _field_ is one of the following values:
>
> -* `name` (search by package name only)
> -* `name-desc` (search by package name and description)
> +* `name` (search by package name or packages which provide that name)
> +* `name-desc` (the same as `name` and search by description)
>  * `maintainer` (search by package maintainer)
>
>  The _by_ parameter can be skipped and defaults to `name-desc`.
> diff --git a/web/lib/aurjson.class.php b/web/lib/aurjson.class.php
> index 9eeaafd..6e580a9 100644
> --- a/web/lib/aurjson.class.php
> +++ b/web/lib/aurjson.class.php
> @@ -392,12 +392,26 @@ class AurJSON {
>                         if (strlen($keyword_string) < 2) {
>                                 return $this->json_error('Query arg too small.');
>                         }
> +
> +                       //packages which provide the package we are looking for:
> +                       $providers = pkg_providers(addcslashes($keyword_string, '%_'));
> +                       $provided_names = array();
> +                       foreach ($providers as $provider) {
> +                               if ($provider[0] != 0) {        // if package is not from repo
> +                                       $name = $this->dbh->quote($provider[1]);
> +                                       array_push($provided_names, $name);
> +                               }
> +                       }
> +                       $provided_query = "(" . join(", ", $provided_names) . ")";
> +
>                         $keyword_string = $this->dbh->quote("%" . addcslashes($keyword_string, '%_') . "%");
>
>                         if ($search_by === 'name') {
> -                               $where_condition = "(Packages.Name LIKE $keyword_string)";
> +                               $where_condition = "(Packages.Name LIKE $keyword_string OR ";
> +                               $where_condition .= "Packages.Name IN $provided_query )";
>                         } else if ($search_by === 'name-desc') {
>                                 $where_condition = "(Packages.Name LIKE $keyword_string OR ";
> +                               $where_condition .= "Packages.Name IN $provided_query OR ";
>                                 $where_condition .= "Description LIKE $keyword_string)";
>                         }
>                 } else if ($search_by === 'maintainer') {
> --
> 2.16.2
>

Sent on behalf of actionless <actionless.loveless@gmail.com>. I have
an intrest in this myself so if he decides he does not want to further
this I wouldn't mind taking it over.
Lukas Fleischer March 20, 2018, 6:36 p.m. UTC | #2
On Tue, 20 Mar 2018 at 05:49:57, morganamilo wrote:
> From: actionless <actionless.loveless@gmail.com>
> 
> ---
>  doc/rpc.txt               |  4 ++--
>  web/lib/aurjson.class.php | 16 +++++++++++++++-
>  2 files changed, 17 insertions(+), 3 deletions(-)
> [...]

Thank you for submitting a patch! Before commenting on implementation
details, I would like to discuss the overall approach, though. What is
the rationale for making it part of name and name-desc? Wouldn't it make
more sense to add a new search type ("providers")?

Regards,
Lukas
Eli Schwartz March 20, 2018, 6:56 p.m. UTC | #3
On 03/20/2018 02:36 PM, Lukas Fleischer wrote:
> On Tue, 20 Mar 2018 at 05:49:57, morganamilo wrote:
>> From: actionless <actionless.loveless@gmail.com>
>>
>> ---
>>  doc/rpc.txt               |  4 ++--
>>  web/lib/aurjson.class.php | 16 +++++++++++++++-
>>  2 files changed, 17 insertions(+), 3 deletions(-)
>> [...]
> 
> Thank you for submitting a patch! Before commenting on implementation
> details, I would like to discuss the overall approach, though. What is
> the rationale for making it part of name and name-desc? Wouldn't it make
> more sense to add a new search type ("providers")?

Well, we have no precedent here as archweb does not allow searching by
provides either AFAICT. But I think, generally speaking, from a
usability perspective any search that is not an exact-name search is
probably *interested* in these provides packages.

pacman -Ss will show them, pacman -S will install them, so the web
interfaces arguably should do the same.

...

Aside: that commit message feels somewhat jarring as it does not match
the (admittedly loose) style in use. I have no idea why this commit is a
"feat" but it feels oddly self-congratulatory, as though the commit is a
circus performer demonstrating a trick.

Am I missing out on the latest fashion or something?
尤立宇 March 21, 2018, 12:47 a.m. UTC | #4
2018-03-21 2:56 GMT+08:00 Eli Schwartz <eschwartz@archlinux.org>:
> > What is
> > the rationale for making it part of name and name-desc? Wouldn't it make
> > more sense to add a new search type ("providers")?
I vote for a new search type "providers".
For my use case, I'd appreciate if I can get precise information from the RPC,
instead of parsing the aurweb html or crawling the entire AUR.
I prefer having a bulk query here so I can keep the request count low.

> Well, we have no precedent here as archweb does not allow searching by
> provides either AFAICT. But I think, generally speaking, from a
> usability perspective any search that is not an exact-name search is
> probably *interested* in these provides packages.

> pacman -Ss will show them, pacman -S will install them, so the web
> interfaces arguably should do the same.

Actually I just found it confusing that:
pacman -Ss 'gcc>7' shows nothing, yet
pacman -S 'gcc>7' will do the right stuff
I'd argue that pacman should provide a search providers. But at least
it has alpm_find_satisifier.
Eli Schwartz March 21, 2018, 12:56 a.m. UTC | #5
On 03/20/2018 08:47 PM, 尤立宇 wrote:
> I vote for a new search type "providers".
> For my use case, I'd appreciate if I can get precise information from the RPC,
> instead of parsing the aurweb html or crawling the entire AUR.
> I prefer having a bulk query here so I can keep the request count low.

So, like, return all the provides results in one search result, then do
another RPC search for multiinfo on all those packages to see exact
details on the pkgbase, provides, etc?

>> pacman -Ss will show them, pacman -S will install them, so the web
>> interfaces arguably should do the same.
> 
> Actually I just found it confusing that:
> pacman -Ss 'gcc>7' shows nothing, yet
> pacman -S 'gcc>7' will do the right stuff
> I'd argue that pacman should provide a search providers. But at least
> it has alpm_find_satisifier.

That's because the >7 is a version specifier, not a search term. On top
of which that is for the main package and has nothing to do with provides.
尤立宇 March 21, 2018, 1:21 a.m. UTC | #6
2018-03-21 8:56 GMT+08:00 Eli Schwartz <eschwartz@archlinux.org>:
> On 03/20/2018 08:47 PM, 尤立宇 wrote:
>> I vote for a new search type "providers".
>> For my use case, I'd appreciate if I can get precise information from the RPC,
>> instead of parsing the aurweb html or crawling the entire AUR.
>> I prefer having a bulk query here so I can keep the request count low.
>
> So, like, return all the provides results in one search result, then do
> another RPC search for multiinfo on all those packages to see exact
> details on the pkgbase, provides, etc?
>
I think it would be like:
HTTP GET /rpc/?v=5&type=provides&arg[]=java-environment&arg[]=libgala.so
{"version":5,"type":"provides","resultcount":2,"results":
 "java-environment": [] // might be list of package names or list of "info"
 "libgala.so": []
}

>>> pacman -Ss will show them, pacman -S will install them, so the web
>>> interfaces arguably should do the same.
>>
>> Actually I just found it confusing that:
>> pacman -Ss 'gcc>7' shows nothing, yet
>> pacman -S 'gcc>7' will do the right stuff
>> I'd argue that pacman should provide a search providers. But at least
>> it has alpm_find_satisifier.
>
> That's because the >7 is a version specifier, not a search term. On top
> of which that is for the main package and has nothing to do with provides.
I got the reasoning about it.
Still, versioned dependencies are valid in the depends array,
so I'd like the provides search in the RPC mentioned above support this,
much like the Dependencies section in the web interface [1].

[1] https://aur.archlinux.org/packages/nvidia-ck/
See how it lists nvidia-utils=390.42, linux-ck-headers>=4.15, etc
morganamilo March 21, 2018, 1:49 a.m. UTC | #7
> I think it would be like:
> HTTP GET /rpc/?v=5&type=provides&arg[]=java-environment&arg[]=libgala.so
> {"version":5,"type":"provides","resultcount":2,"results":
>   "java-environment": [] // might be list of package names or list of "info"
>   "libgala.so": []
> }
Pacman can "-S" and "-Ss" by provides so I would prefer to have the 
option to search by provides in type=info and type=search rather than a 
new type=provides. Also I would like each option to be indivudually 
settable, so "provides", "name" and "desc" can be set indipendantly.
尤立宇 March 21, 2018, 2:33 a.m. UTC | #8
2018-03-21 9:49 GMT+08:00 morganamilo <morganamilo@gmail.com>:
>> I think it would be like:
>> HTTP GET /rpc/?v=5&type=provides&arg[]=java-environment&arg[]=libgala.so
>> {"version":5,"type":"provides","resultcount":2,"results":
>>   "java-environment": [] // might be list of package names or list of
>> "info"
>>   "libgala.so": []
>> }
>
> Pacman can "-S" and "-Ss" by provides so I would prefer to have the option
> to search by provides in type=info and type=search rather than a new
> type=provides. Also I would like each option to be indivudually settable, so
> "provides", "name" and "desc" can be set indipendantly.
I think I'm looking for an alpm_find_satisifier equivalent rather than
a pacman -Ss equalviant,
and that's why I'm proposing a new type=provides.

For type=info after thinking again, is it
a) The dependencies providers' details are listed in the "Depends"
"MakeDepends" "OptDepends"
b) The "results" items are unchanged, but querying a name not only
returns the package name that matches,
but also their provides array matches. That is
?type=info&arg[]=java-environment would return
resultcount > 1
or something else?

I think b) would work for me, I will just have to verify the version
afterwards with vercmp in that case.
morganamilo March 21, 2018, 2:58 a.m. UTC | #9
On 21/03/18 02:33, 尤立宇 wrote:
> 2018-03-21 9:49 GMT+08:00 morganamilo <morganamilo@gmail.com>:
>>> I think it would be like:
>>> HTTP GET /rpc/?v=5&type=provides&arg[]=java-environment&arg[]=libgala.so
>>> {"version":5,"type":"provides","resultcount":2,"results":
>>>    "java-environment": [] // might be list of package names or list of
>>> "info"
>>>    "libgala.so": []
>>> }
>> Pacman can "-S" and "-Ss" by provides so I would prefer to have the option
>> to search by provides in type=info and type=search rather than a new
>> type=provides. Also I would like each option to be indivudually settable, so
>> "provides", "name" and "desc" can be set indipendantly.
> I think I'm looking for an alpm_find_satisifier equivalent rather than
> a pacman -Ss equalviant,
> and that's why I'm proposing a new type=provides.
>
> For type=info after thinking again, is it
> a) The dependencies providers' details are listed in the "Depends"
> "MakeDepends" "OptDepends"
> b) The "results" items are unchanged, but querying a name not only
> returns the package name that matches,
> but also their provides array matches. That is
> ?type=info&arg[]=java-environment would return
> resultcount > 1
> or something else?
>
> I think b) would work for me, I will just have to verify the version
> afterwards with vercmp in that case.

An alpm_find_satisifier equivilent is what I want too, thing is once you get the package names from that alpm_find_satisifier equivelent what are you going to do with thoes names? Probably call "info" on all of them. So why not skip that step and just have info support it.

Yes point b is what I was thinking. info("foo") would return "foo" because it matches by name, and it would also return "foo-git" because it provides "foo". Maybe it would be &type=info&by[]=provides&by[]=name rather than just "info". Point being you still get the same hunk of json that info gets you.

Would your type=provides return just package names or the same stuff info does? If its the latter it's the same as what I want but under a different name.
尤立宇 March 21, 2018, 3:12 a.m. UTC | #10
> An alpm_find_satisifier equivilent is what I want too, thing is once you get
> the package names from that alpm_find_satisifier equivelent what are you
> going to do with thoes names? Probably call "info" on all of them. So why
> not skip that step and just have info support it.
>
> Yes point b is what I was thinking. info("foo") would return "foo" because
> it matches by name, and it would also return "foo-git" because it provides
> "foo". Maybe it would be &type=info&by[]=provides&by[]=name rather than just
> "info". Point being you still get the same hunk of json that info gets you.
>
> Would your type=provides return just package names or the same stuff info
> does? If its the latter it's the same as what I want but under a different
> name.
I'm convinced that just putting that into info would be better.
Looking at the code also shows that adding by[]=provides is easier to implement
as it fits better to the current architecture.
morganamilo March 21, 2018, 3:20 a.m. UTC | #11
On 21/03/18 03:12, 尤立宇 wrote:
>> An alpm_find_satisifier equivilent is what I want too, thing is once you get
>> the package names from that alpm_find_satisifier equivelent what are you
>> going to do with thoes names? Probably call "info" on all of them. So why
>> not skip that step and just have info support it.
>>
>> Yes point b is what I was thinking. info("foo") would return "foo" because
>> it matches by name, and it would also return "foo-git" because it provides
>> "foo". Maybe it would be &type=info&by[]=provides&by[]=name rather than just
>> "info". Point being you still get the same hunk of json that info gets you.
>>
>> Would your type=provides return just package names or the same stuff info
>> does? If its the latter it's the same as what I want but under a different
>> name.
> I'm convinced that just putting that into info would be better.
> Looking at the code also shows that adding by[]=provides is easier to implement
> as it fits better to the current architecture.
Glad you agree. This patch was more about opening up discussion before 
going in and changing a bunch of stuff. Is there any offial decision by 
the devs and a little guidence on how the implementation should look?
Lukas Fleischer March 21, 2018, 6:44 a.m. UTC | #12
On Wed, 21 Mar 2018 at 04:20:18, morganamilo wrote:
> On 21/03/18 03:12, 尤立宇 wrote:
> >> An alpm_find_satisifier equivilent is what I want too, thing is once you get
> >> the package names from that alpm_find_satisifier equivelent what are you
> >> going to do with thoes names? Probably call "info" on all of them. So why
> >> not skip that step and just have info support it.
> >>
> >> Yes point b is what I was thinking. info("foo") would return "foo" because
> >> it matches by name, and it would also return "foo-git" because it provides
> >> "foo". Maybe it would be &type=info&by[]=provides&by[]=name rather than just
> >> "info". Point being you still get the same hunk of json that info gets you.
> >>
> >> Would your type=provides return just package names or the same stuff info
> >> does? If its the latter it's the same as what I want but under a different
> >> name.
> > I'm convinced that just putting that into info would be better.
> > Looking at the code also shows that adding by[]=provides is easier to implement
> > as it fits better to the current architecture.
> Glad you agree. This patch was more about opening up discussion before 
> going in and changing a bunch of stuff. Is there any offial decision by 
> the devs and a little guidence on how the implementation should look?

As I already suggested before, I like the idea of adding multiple "by"
arguments to the search. I wonder what to do with info queries. If we do
the same thing there, wouldn't it be natural to add all the "by" types
from search as well? Maybe info and search could/should be able to
handle the same kind of queries and only differ in the amount of details
returned...
尤立宇 March 21, 2018, 8:10 a.m. UTC | #13
>> > I'm convinced that just putting that into info would be better.
>> > Looking at the code also shows that adding by[]=provides is easier to implement
>> > as it fits better to the current architecture.
>> Glad you agree. This patch was more about opening up discussion before
>> going in and changing a bunch of stuff. Is there any offial decision by
>> the devs and a little guidence on how the implementation should look?
>
> As I already suggested before, I like the idea of adding multiple "by"
> arguments to the search. I wonder what to do with info queries. If we do
> the same thing there, wouldn't it be natural to add all the "by" types
> from search as well? Maybe info and search could/should be able to
> handle the same kind of queries and only differ in the amount of details
> returned...

From my point of view, the search should look for the queried
substring in the fields,
while info looks for exact match. My study on the code shows the only
difference now
is how the $where_condition is built [1] and the level of detail is the same.

Here's some code highlights (v5 only). Except maintainer, search
performs a LIKE query
and info performs a IN query. Also, search only supports single
argument, while info supports multiple.

search/name:
"(Packages.Name LIKE $keyword_string)"
search/name-desc:
"(Packages.Name LIKE $keyword_string OR Description LIKE $keyword_string)"
search/maintainer:
"Users.Username = $keyword_string "
info:
"Packages.Name IN ($names_value) "

To support multiple "by" arguments, I came up with this, assuming by=by1&by=by2
search: any by fields contain the argument
WHERE (Packages.By1 LILKE $keyword_string OR Packages.By2 LIKE $keyword_string)
info: any by fields equals one of the argument
WHERE (Packages.By1 IN ($names_value) OR Packages.By2 IN ($names_value)
I think this way more fields can be easily supported.

With this in mind,
search by=name-desc would become by=name&by=desc
search by=maintainer would become info by=maintainer.
Does this make sense to you?

[1]
search: https://git.archlinux.org/aurweb.git/tree/web/lib/aurjson.class.php#n412
multiinfo: https://git.archlinux.org/aurweb.git/tree/web/lib/aurjson.class.php#n467
Lukas Fleischer March 29, 2018, 5:23 a.m. UTC | #14
On Wed, 21 Mar 2018 at 09:10:10, 尤立宇 wrote:
> [...]
> To support multiple "by" arguments, I came up with this, assuming by=by1&by=by2
> search: any by fields contain the argument
> WHERE (Packages.By1 LILKE $keyword_string OR Packages.By2 LIKE $keyword_string)
> info: any by fields equals one of the argument
> WHERE (Packages.By1 IN ($names_value) OR Packages.By2 IN ($names_value)
> I think this way more fields can be easily supported.
> 
> With this in mind,
> search by=name-desc would become by=name&by=desc
> search by=maintainer would become info by=maintainer.
> Does this make sense to you?
> [...]

Yes, sounds reasonable. Note that we still need to support the old
implementation and should probably create a new version of the RPC API.

Regards,
Lukas
diff mbox

Patch

diff --git a/doc/rpc.txt b/doc/rpc.txt
index f353ff0..83cdae3 100644
--- a/doc/rpc.txt
+++ b/doc/rpc.txt
@@ -8,8 +8,8 @@  Package searches can be performed by issuing HTTP GET requests of the form
 +/rpc/?v=5&type=search&by=_field_&arg=_keywords_+ where _keywords_ is the
 search argument and _field_ is one of the following values:
 
-* `name` (search by package name only)
-* `name-desc` (search by package name and description)
+* `name` (search by package name or packages which provide that name)
+* `name-desc` (the same as `name` and search by description)
 * `maintainer` (search by package maintainer)
 
 The _by_ parameter can be skipped and defaults to `name-desc`.
diff --git a/web/lib/aurjson.class.php b/web/lib/aurjson.class.php
index 9eeaafd..6e580a9 100644
--- a/web/lib/aurjson.class.php
+++ b/web/lib/aurjson.class.php
@@ -392,12 +392,26 @@  class AurJSON {
 			if (strlen($keyword_string) < 2) {
 				return $this->json_error('Query arg too small.');
 			}
+
+			//packages which provide the package we are looking for:
+			$providers = pkg_providers(addcslashes($keyword_string, '%_'));
+			$provided_names = array();
+			foreach ($providers as $provider) {
+				if ($provider[0] != 0) {	// if package is not from repo
+					$name = $this->dbh->quote($provider[1]);
+					array_push($provided_names, $name);
+				}
+			}
+			$provided_query = "(" . join(", ", $provided_names) . ")";
+
 			$keyword_string = $this->dbh->quote("%" . addcslashes($keyword_string, '%_') . "%");
 
 			if ($search_by === 'name') {
-				$where_condition = "(Packages.Name LIKE $keyword_string)";
+				$where_condition = "(Packages.Name LIKE $keyword_string OR ";
+				$where_condition .= "Packages.Name IN $provided_query )";
 			} else if ($search_by === 'name-desc') {
 				$where_condition = "(Packages.Name LIKE $keyword_string OR ";
+				$where_condition .= "Packages.Name IN $provided_query OR ";
 				$where_condition .= "Description LIKE $keyword_string)";
 			}
 		} else if ($search_by === 'maintainer') {