github.com/benhoyt/goawk@v1.8.1/testdata/gawk/gsubtst5.awk

# From jose@monkey.org  Thu Jun  5 11:48:35 2003
# Return-Path: <jose@monkey.org>
# Received: from localhost (skeeve [127.0.0.1])
# 	by skeeve.com (8.12.5/8.12.5) with ESMTP id h558eVvA012655
# 	for <arnold@localhost>; Thu, 5 Jun 2003 11:48:35 +0300
# Received: from actcom.co.il [192.114.47.1]
# 	by localhost with POP3 (fetchmail-5.9.0)
# 	for arnold@localhost (single-drop); Thu, 05 Jun 2003 11:48:35 +0300 (IDT)
# Received: by actcom.co.il (mbox arobbins)
#  (with Cubic Circle's cucipop (v1.31 1998/05/13) Thu Jun  5 11:47:59 2003)
# X-From_: jose@monkey.org Thu Jun  5 07:14:45 2003
# Received: from smtp1.actcom.net.il by actcom.co.il  with ESMTP
# 	(8.11.6/actcom-0.2) id h554EdY08108 for <arobbins@actcom.co.il>;
# 	Thu, 5 Jun 2003 07:14:41 +0300 (EET DST)
# 	(rfc931-sender: smtp.actcom.co.il [192.114.47.13])
# Received: from f7.net (consort.superb.net [209.61.216.22])
# 	by smtp1.actcom.net.il (8.12.8/8.12.8) with ESMTP id h554G3To008304
# 	for <arobbins@actcom.co.il>; Thu, 5 Jun 2003 07:16:05 +0300
# Received: from fencepost.gnu.org (fencepost.gnu.org [199.232.76.164])
# 	by f7.net (8.11.7/8.11.6) with ESMTP id h554Ean08172
# 	for <arnold@skeeve.com>; Thu, 5 Jun 2003 00:14:36 -0400
# Received: from monty-python.gnu.org ([199.232.76.173])
# 	by fencepost.gnu.org with esmtp (Exim 4.20)
# 	id 19Nm96-0001xE-1i
# 	for arnold@gnu.ai.mit.edu; Thu, 05 Jun 2003 00:14:36 -0400
# Received: from mail by monty-python.gnu.org with spam-scanned (Exim 4.20)
# 	id 19Nm8x-0005ge-Dz
# 	for arnold@gnu.ai.mit.edu; Thu, 05 Jun 2003 00:14:28 -0400
# Received: from naughty.monkey.org ([66.93.9.164])
# 	by monty-python.gnu.org with esmtp (Exim 4.20)
# 	id 19Nm8w-0005VM-Ko
# 	for arnold@gnu.ai.mit.edu; Thu, 05 Jun 2003 00:14:26 -0400
# Received: by naughty.monkey.org (Postfix, from userid 1203)
# 	id C15511BA97B; Thu,  5 Jun 2003 00:14:19 -0400 (EDT)
# Received: from localhost (localhost [127.0.0.1])
# 	by naughty.monkey.org (Postfix) with ESMTP
# 	id BF9821BA969; Thu,  5 Jun 2003 00:14:19 -0400 (EDT)
# Date: Thu, 5 Jun 2003 00:14:19 -0400 (EDT)
# From: Jose Nazario <jose@monkey.org>
# To: bug-gnu-utils@prep.ai.mit.edu, arnold@gnu.ai.mit.edu,
#    netbsd-bugs@netbsd.org
# Subject: bug in gawk/gsub() (not present in nawk)
# Message-ID: <Pine.BSO.4.51.0306050007160.31577@naughty.monkey.org>
# MIME-Version: 1.0
# Content-Type: TEXT/PLAIN; charset=US-ASCII
# X-Spam-Status: No, hits=-1.2 required=5.0
# 	tests=SPAM_PHRASE_00_01,USER_AGENT_PINE
# 	version=2.41
# X-Spam-Level:
# X-SpamBouncer: 1.4 (10/07/01)
# X-SBClass: OK
# Status: R
#
# while playing with some tools in data massaging, i had to migrate from an
# openbsd/nawk system to a netbsd/gawk system. i found the following
# behavior, which seems to be a bug.
#
# the following gsub() pattern has a strange effect under gawk which is not
# visible in nawk (at least as compiled on openbsd). the intention is to
# take a string like "This Is a Title: My Title?" and turn it into a
# normalized string: "ThisIsaTitleMyTitle". to do this, i wrote the
# following gross gsub line in an awk script:
#
# 	gsub(/[\ \"-\/\\:;\[\]\@\?\.\,\$]/, "", $2)
# 	print $2
#
# in gawk, as found in netbsd-macppc/1.5.2, this will drop the first letter
# of every word. the resulting string will be "hissitleyitle", while in nawk
# as built on openbsd-3.3 this will get it correct.
#
# any insights? the inconsistency with this relatively naive pattern seems a
# bit odd. (i wound up installing nawk built from openbsd sources.)
#
# thanks. sorry i didn't send a better bug report, netbsd folks, i'm not
# much of a netbsd user, and i don't have send-pr set up. yes, this is a
# slightly older version of netbsd and gawk:
#
# $ uname -a
# NetBSD entropy 1.5.2 NetBSD 1.5.2 (GENERIC) #0: Sun Feb 10 02:00:04 EST
# 2002     jose@entropy:/usr/src/sys/arch/macppc/compile/GENERIC macppc
# $ awk --version
# GNU Awk 3.0.3
# Copyright (C) 1989, 1991-1997 Free Software Foundation.
#
#
#
# thanks.
#
# ___________________________
# jose nazario, ph.d.			jose@monkey.org
# 					http://monkey.org/~jose/
#
#
{
	gsub(/[\ \"-\/\\:;\[\]\@\?\.\,\$]/, "")
	print
}
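
# Illustrative sketch only, not part of the original bug report or test:
# one possible way to express the normalization described in the report
# ("This Is a Title: My Title?" -> "ThisIsaTitleMyTitle") without the
# backslash escapes and the \"-\/ span of the original bracket expression,
# which awk reads as a character range. The function name normalize() is
# hypothetical, and the POSIX class [:alnum:] is assumed to be supported by
# the awk in use. It strips every non-alphanumeric character, which is
# broader than the original character list. The function is never called
# here, so the output of this test is unchanged.
function normalize(s) {
	# Delete every character that is not a letter or a digit.
	gsub(/[^[:alnum:]]/, "", s)
	return s
}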