From patchwork Sat Feb 1 18:07:41 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Frédéric Mangano-Tarumi
X-Patchwork-Id: 1486
Date: Sat, 1 Feb 2020 19:07:41 +0100
From: Frédéric Mangano-Tarumi
To: aur-dev@archlinux.org
Subject: [PATCH] rendercomment: safer auto-linkification of URLs
Message-ID: <20200201180741.GA141839@tsubame.mg0.fr>
List-Id: "Arch User Repository \(AUR\) Development"

Fixes a few edge cases:

- URLs within code blocks used to get redundant <> added, breaking bash
  code snippets like `curl https://...` into `curl <https://...>`.

- Links written with markdown’s <link> syntax also used to get an extra
  pair of brackets.
---
 aurweb/scripts/rendercomment.py | 19 +++++++++++--------
 test/t2600-rendercomment.sh     | 15 +++++++++++++--
 2 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/aurweb/scripts/rendercomment.py b/aurweb/scripts/rendercomment.py
index ad39ceb..ba28486 100755
--- a/aurweb/scripts/rendercomment.py
+++ b/aurweb/scripts/rendercomment.py
@@ -13,17 +13,20 @@ repo_path = aurweb.config.get('serve', 'repo-path')
 commit_uri = aurweb.config.get('options', 'commit_uri')
 
 
-class LinkifyPreprocessor(markdown.preprocessors.Preprocessor):
-    _urlre = re.compile(r'(\b(?:https?|ftp):\/\/[\w\/\#~:.?+=&%@!\-;,]+?'
-                        r'(?=[.:?\-;,]*(?:[^\w\/\#~:.?+=&%@!\-;,]|$)))')
-
-    def run(self, lines):
-        return [self._urlre.sub(r'<\1>', line) for line in lines]
+class LinkifyExtension(markdown.extensions.Extension):
+    """
+    Turn URLs into links, even without explicit markdown.
+    Do not linkify URLs in code blocks.
+    """
 
+    # Captures http(s) and ftp URLs until the first non URL-ish character.
+    # Excludes trailing punctuation.
+    _urlre = (r'(\b(?:https?|ftp):\/\/[\w\/\#~:.?+=&%@!\-;,]+?'
+              r'(?=[.:?\-;,]*(?:[^\w\/\#~:.?+=&%@!\-;,]|$)))')
 
-class LinkifyExtension(markdown.extensions.Extension):
     def extendMarkdown(self, md, md_globals):
-        md.preprocessors.add('linkify', LinkifyPreprocessor(md), '_end')
+        processor = markdown.inlinepatterns.AutolinkInlineProcessor(self._urlre, md)
+        md.inlinePatterns.add('linkify', processor, '_end')
 
 
 class FlysprayLinksPreprocessor(markdown.preprocessors.Preprocessor):
diff --git a/test/t2600-rendercomment.sh b/test/t2600-rendercomment.sh
index 7b3a4a8..b0209eb 100755
--- a/test/t2600-rendercomment.sh
+++ b/test/t2600-rendercomment.sh
@@ -51,11 +51,22 @@ test_expect_success 'Test HTML sanitizing.' '
 
 test_expect_success 'Test link conversion.' '
 	cat <<-EOD | sqlite3 aur.db &&
-	INSERT INTO PackageComments (ID, PackageBaseID, Comments, RenderedComment) VALUES (4, 1, "Visit https://www.archlinux.org/.", "");
+	INSERT INTO PackageComments (ID, PackageBaseID, Comments, RenderedComment) VALUES (4, 1, "
+		Visit https://www.archlinux.org/.
+		Visit <https://www.archlinux.org/>.
+		Visit \`https://www.archlinux.org/\`.
+		Visit [Arch Linux](https://www.archlinux.org/).
+		Visit [Arch Linux][arch].
+		[arch]: https://www.archlinux.org/
+		", "");
 	EOD
 	"$RENDERCOMMENT" 4 &&
 	cat <<-EOD >expected &&
-	<p>Visit <a href="https://www.archlinux.org/">https://www.archlinux.org/</a>.</p>
+	<p>Visit <a href="https://www.archlinux.org/">https://www.archlinux.org/</a>.
+	Visit <a href="https://www.archlinux.org/">https://www.archlinux.org/</a>.
+	Visit <code>https://www.archlinux.org/</code>.
+	Visit <a href="https://www.archlinux.org/">Arch Linux</a>.
+	Visit <a href="https://www.archlinux.org/">Arch Linux</a>.</p>
 	EOD
 	cat <<-EOD | sqlite3 aur.db >actual &&
 	SELECT RenderedComment FROM PackageComments WHERE ID = 4;