<!--
    Licensed to the Apache Software Foundation (ASF) under one
    or more contributor license agreements.  See the NOTICE file
    distributed with this work for additional information
    regarding copyright ownership.  The ASF licenses this file
    to you under the Apache License, Version 2.0 (the
    "License"); you may not use this file except in compliance
    with the License.  You may obtain a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing,
    software distributed under the License is distributed on an
    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    KIND, either express or implied.  See the License for the
    specific language governing permissions and limitations
    under the License.
-->

# Go SDK

The Apache Beam Go SDK is the Beam Model implemented in the [Go Programming Language](https://go.dev/).
It is based on the following initial [design](https://s.apache.org/beam-go-sdk-design-rfc).

## How to run the examples

**Prerequisites**: to use Google Cloud sources and sinks (default for
most examples), follow the setup
[here](https://beam.apache.org/documentation/runners/dataflow/). You can
verify that it works by running the corresponding Java example.

The examples are normal Go programs and are most easily run directly.
They are parameterized by Go flags.
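
To illustrate what "parameterized by Go flags" means, here is a minimal sketch that uses only the standard library `flag` package. It is not taken from the Beam examples and does not use the Beam API. The `config` type and the `parseArgs` helper are hypothetical names introduced for illustration. The `--output` flag name and the default input path mirror the wordcount command and log output shown in this README, but treating `--output` as required is an assumption of this sketch.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// config holds the parsed flag values. The type and its field names are
// hypothetical and exist only for this illustration.
type config struct {
	input  string
	output string
}

// parseArgs parses args (without the program name) into a config.
// It builds its own FlagSet so it can be called more than once.
func parseArgs(args []string) (config, error) {
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	input := fs.String("input", "gs://apache-beam-samples/shakespeare/kinglear.txt", "File to read.")
	output := fs.String("output", "", "Output file (assumed required in this sketch).")
	if err := fs.Parse(args); err != nil {
		return config{}, err
	}
	if *output == "" {
		return config{}, fmt.Errorf("no output provided: set --output")
	}
	return config{input: *input, output: *output}, nil
}

func main() {
	cfg, err := parseArgs(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Printf("Reading from %s\n", cfg.input)
	fmt.Printf("Writing to %s\n", cfg.output)
}
```

The Go `flag` package accepts both `-output=...` and `--output=...`, so either spelling works on the command line.
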
For example, to run wordcount on the Go direct runner do:

```
$ pwd
[...]/sdks/go
$ go run examples/wordcount/wordcount.go --output=/tmp/result.txt
[{6: KV<string,int>/GW/KV<bytes,int[varintz]>}]
[{10: KV<int,string>/GW/KV<int[varintz],bytes>}]
2018/03/21 09:39:03 Pipeline:
2018/03/21 09:39:03 Nodes: {1: []uint8/GW/bytes}
{2: string/GW/bytes}
{3: string/GW/bytes}
{4: string/GW/bytes}
{5: string/GW/bytes}
{6: KV<string,int>/GW/KV<bytes,int[varintz]>}
{7: CoGBK<string,int>/GW/CoGBK<bytes,int[varintz]>}
{8: KV<string,int>/GW/KV<bytes,int[varintz]>}
{9: string/GW/bytes}
{10: KV<int,string>/GW/KV<int[varintz],bytes>}
{11: CoGBK<int,string>/GW/CoGBK<int[varintz],bytes>}
Edges: 1: Impulse [] -> [Out: []uint8 -> {1: []uint8/GW/bytes}]
2: ParDo [In(Main): []uint8 <- {1: []uint8/GW/bytes}] -> [Out: T -> {2: string/GW/bytes}]
3: ParDo [In(Main): string <- {2: string/GW/bytes}] -> [Out: string -> {3: string/GW/bytes}]
4: ParDo [In(Main): string <- {3: string/GW/bytes}] -> [Out: string -> {4: string/GW/bytes}]
5: ParDo [In(Main): string <- {4: string/GW/bytes}] -> [Out: string -> {5: string/GW/bytes}]
6: ParDo [In(Main): T <- {5: string/GW/bytes}] -> [Out: KV<T,int> -> {6: KV<string,int>/GW/KV<bytes,int[varintz]>}]
7: CoGBK [In(Main): KV<string,int> <- {6: KV<string,int>/GW/KV<bytes,int[varintz]>}] -> [Out: CoGBK<string,int> -> {7: CoGBK<string,int>/GW/CoGBK<bytes,int[varintz]>}]
8: Combine [In(Main): int <- {7: CoGBK<string,int>/GW/CoGBK<bytes,int[varintz]>}] -> [Out: KV<string,int> -> {8: KV<string,int>/GW/KV<bytes,int[varintz]>}]
9: ParDo [In(Main): KV<string,int> <- {8: KV<string,int>/GW/KV<bytes,int[varintz]>}] -> [Out: string -> {9: string/GW/bytes}]
10: ParDo [In(Main): T <- {9: string/GW/bytes}] -> [Out: KV<int,T> -> {10: KV<int,string>/GW/KV<int[varintz],bytes>}]
11: CoGBK [In(Main): KV<int,string> <- {10: KV<int,string>/GW/KV<int[varintz],bytes>}] -> [Out: CoGBK<int,string> -> {11: CoGBK<int,string>/GW/CoGBK<int[varintz],bytes>}]
12: ParDo [In(Main): CoGBK<int,string> <- {11: CoGBK<int,string>/GW/CoGBK<int[varintz],bytes>}] -> []
2018/03/21 09:39:03 Reading from gs://apache-beam-samples/shakespeare/kinglear.txt
2018/03/21 09:39:04 Writing to /tmp/result.txt
```

The debugging output is currently quite verbose and likely to change. The output is a local
file in this case:

```
$ head /tmp/result.txt
while: 2
darkling: 1
rail'd: 1
ford: 1
bleed's: 1
hath: 52
Remain: 1
disclaim: 1
sentence: 1
purse: 6
```

To run wordcount on the Dataflow runner do:

```
$  go run wordcount.go --runner=dataflow --project=<YOUR_GCP_PROJECT> --region=<YOUR_GCP_REGION> --staging_location=<YOUR_GCS_LOCATION>/staging --worker_harness_container_image=<YOUR_SDK_HARNESS_IMAGE_LOCATION> --output=<YOUR_GCS_LOCATION>/output
```

The output is a GCS file in this case:

```
$ gsutil cat <YOUR_GCS_LOCATION>/output* | head
Blanket: 1
blot: 1
Kneeling: 3
cautions: 1
appears: 4
Deserved: 1
nettles: 1
OSWALD: 53
sport: 3
Crown'd: 1
```


See [BUILD.md](./BUILD.md) for how to build Go code in general. See
[container documentation](https://beam.apache.org/documentation/runtime/environments/#building-container-images) for how to build and push the Go SDK harness container image.

## Issues

Please use the [`sdk-go`](https://github.com/apache/beam/issues?q=is%3Aopen+is%3Aissue+label%3Asdk-go) label for any bugs or feature requests.

## Contributing to the Go SDK

### New to developing Go?

https://tour.golang.org : The Go Tour teaches the basics of the language interactively, with no installation required.

https://github.com/campoy/go-tooling-workshop is a great start on learning good (optional) development tools for Go.

### Developing Go Beam SDK on GitHub

The Go SDK uses Go Modules for dependency management, so it's as simple as cloning
the repo, making necessary changes, and running tests.

To execute all unit tests for the SDK, run `go test ./...` from the `<beam root>/sdks/go` directory.

To test your change as Jenkins would execute it from a PR, from the
beam root directory, run:
 * `./gradlew :sdks:go:goTest` executes the unit tests.
 * `./gradlew :sdks:go:test:ulrValidatesRunner` validates the SDK against the Portable Python runner.
 * `./gradlew :sdks:go:test:flinkValidatesRunner` validates the SDK against the Flink runner.

Follow the [contribution guide](https://beam.apache.org/contribute/contribution-guide/#code) to create branches and submit pull requests as normal.