# Tast Architecture Guide (go/tast-architecture-guide)

This document describes the high-level architecture of the Tast framework, and
provides guidance for future enhancements to the framework.

[TOC]

## Introduction

Tast framework feature development is mostly about designing concepts. That is
because everything else in the framework, including APIs and internal
implementations, is designed based on the abstract concepts we define.
Well-designed concepts give us simple APIs that users can easily understand,
and maintainable internal implementations. Bad concepts lead to user confusion
and maintenance nightmares.

This document was written to help you understand the current architecture of
Tast, and design new framework features.

This document first explains Tast's overall architecture and important existing
concepts. Next, it provides high-level guidance for future enhancements, citing
many examples of good/bad decisions we have made in the past. Finally, it
mentions several best practices we learned from framework development.

## Background

### Remote end-to-end testing

Tast is a remote end-to-end testing framework, primarily targeting ChromeOS.

There are two important aspects of Tast here: **end-to-end** and **remote**.

- **End-to-end**: Tast runs tests against a complete **target product**. Tests
  run in a Linux process independent from the target product, so they interact
  with the target product by simulating user inputs (e.g. generating keyboard
  events), calling into test APIs provided by the target product (e.g. Chrome
  DevTools protocol), etc.
- **Remote**: Tast involves two types of machines: a **host system** and one or
  more **target systems**. Tast tests are initiated from a host system, and
  exercise target products running on target systems remotely.
  Tast requires that target systems are reachable via SSH. Tast tests may use
  additional means to interact with target systems, for example peripherals
  physically attached to a target device.

### Two types of Tast users

Tast has two types of users:

- **Test authors** who use Tast to write test scenarios. Test authors include
  not only authors of individual tests but also authors of support libraries
  used by multiple tests. Tast provides **Go APIs** to test authors, which
  allow them to register their tests with the framework and access resources
  needed to perform test scenarios.
- **Test requesters** who use Tast to run test scenarios. Continuous integration
  systems configured to run Tast tests automatically are the most significant
  test requesters. Test authors are also considered test requesters since they
  need to run work-in-progress tests to ensure they're correct. Tast provides
  a **CLI command** to test requesters, which allows them to run Tast tests and
  consume their results.

Tast stands between these two types of users. It is important to know that they
have different, and sometimes even conflicting, needs.

## Current architecture

This chapter describes the architecture of the Tast framework as of writing.

### High-level structure

At a high level, Tast-related components can be largely categorized into two:
**framework** and **user code**.

- **Framework** is the engine that executes tests defined in user code.
  Framework code resides in the [chromiumos/platform/tast] repository.
  Test authors rarely make changes to the framework. This document primarily
  discusses the design of the framework.
- **User code** is the code written by test authors. User code resides
  mainly in the [chromiumos/platform/tast-tests] repository, but there are
  several other repositories such as [chromeos/platform/tast-tests-private].

User code can be further categorized into two subcategories:

- **Support libraries** are a collection of common libraries used by tests.
- **Tests** are actual test cases written by test authors.

The framework provides the following APIs to users:

- Test authors: **Go APIs** to interact with the framework, including:
  - Registering entities (e.g. tests) to the framework
  - Defining a test bundle
  - Some basic libraries shared with the framework (e.g. SSH)
- Test requesters: **CLI command ("Tast CLI")** to work with tests, providing:
  - Command line flags and parameters to specify execution configuration
  - Stable protocols to report test results

The next diagram illustrates the relationship of those layers.

[chromiumos/platform/tast]: https://chromium.googlesource.com/chromiumos/platform/tast
[chromiumos/platform/tast-tests]: https://chromium.googlesource.com/chromiumos/platform/tast-tests
[chromeos/platform/tast-tests-private]: https://chrome-internal.googlesource.com/chromeos/platform/tast-tests-private

### Concepts

#### Tests

A **test** is a unit of test scenarios defined by test authors.

A test is defined in a .go file. We call such files **test files**. A test
file must define these two functions:

1. **Test registration**: An init() function that registers a test to the
   framework on initialization.
2. **Test function**: An exported function that implements a test scenario.

Here is a complete example of a test file defining a no-op test:

```go
// File: src/go.chromium.org/tast-tests/cros/local/bundles/cros/example/pass.go

package example

import (
	"context"

	"go.chromium.org/tast/core/testing"
)

func init() {
	testing.AddTest(&testing.Test{
		Func:     Pass,
		Desc:     "Always passes",
		Contacts: []string{"nya@chromium.org", "tast-owners@google.com"},
		Attr:     []string{"group:mainline"},
	})
}

// Pass is the test function. It reports no errors, so the test passes.
func Pass(ctx context.Context, s *testing.State) {}
```

**Test metadata** is represented by a testing.Test struct passed to
testing.AddTest on registration. Test metadata includes, but is not limited to,
the following fields:

- Func: A test function
- Desc: Human-readable description of a test
- Contacts: Contact emails
- Attr: Attributes assigned to a test
- Data: Data files needed by a test
- SoftwareDeps/HardwareDeps: Dependencies required to run a test
- VarDeps: Runtime variables needed by a test
- ServiceDeps: Services needed by a test

Note that test names are not included in test metadata. A test name is
automatically derived by joining a package name and a test function name with a
period. In the example above, the name of the test is "example.Pass".

When a test requester executes Tast CLI to run a test, the framework calls its
test function, passing a context.Context and a testing.State, with which it can
access resources needed for test scenario execution.

One of the most important things a test function does is to **report test
errors**. A test is considered **failed** if it reports one or more errors.
Otherwise, a test is considered **passed**. Once the framework starts a test,
its result is either passed or failed.
If a test cannot be run under certain conditions, it should declare the
constraints as software/hardware dependencies so that the framework skips it
without executing it.

A test can save output files for post-run inspection. testing.State.OutDir and
testing.ContextOutDir return a directory where a test should place output
files.

*** note
**Note**: Additional restrictions on test files

For better consistency and readability, we have a lint checker which enforces
various rules on how to define a test. Here are some notable rules:

- A test file must define exactly one test. It is prohibited to define two or
  more tests in a file.
- A test file name must match the base name of a test. For example, a test
  named "pkg.FooBar" must be defined in pkg/foo_bar.go.
- A test file must not define any exported symbols other than a test function.
  It can still optionally define unexported symbols (constants, variables,
  functions...) that are used by the test.

These rules are designed to make it very easy to find a test file from a test name.
***

#### Local/remote

There are two types of tests: **local tests** and **remote tests**. Local tests
are executed in a process running on the target system, while remote tests are
executed in a process running on the host system.

Unless remote test functionality is needed, a test is better written as a local
test. Local tests can interact with the target system much more easily since
they get direct access to the resources on the target system (e.g. local file
system, network sockets, system calls).

There are several cases where remote tests are needed. One of the most common
cases is a test rebooting the target system. Such a test cannot be written as a
local test since a reboot would terminate the testing process itself.
Other possible cases where remote tests are needed are: a test temporarily
detaching the target system from the network, a test controlling the target
system via peripherals attached to it, or a test interacting with multiple
target systems.

Fixtures can also be local or remote. See [Fixtures](#fixtures) for details.

#### Services

A **service** is a user-defined gRPC service that can be run on the target
system to be called from remote tests and fixtures.

Remote tests have access to the target device via SSH, with which they can in
theory interact with the target device. However, SSH is only capable of running
external commands on the target system, which is not enough to perform
complicated test scenarios. Instead, users can implement services and call them
from remote tests via gRPC. This way, remote tests can call into support
libraries built for local tests.

Services are registered to the framework in a way very similar to tests.
A service is defined in a .go file called a **service file**, containing the
following two symbols:

1. **Service registration**: An init() function that registers a service to the
   framework on initialization.
2. **Service implementation**: An exported type that implements a gRPC service.

Here is an example of a service file:

```go
// File: src/go.chromium.org/tast-tests/cros/local/bundles/cros/example/foo_service.go

func init() {
	testing.AddService(&testing.Service{
		Register: func(srv *grpc.Server, s *testing.ServiceState) {
			example.RegisterFooServiceServer(srv, &FooService{s: s})
		},
	})
}

// FooService implements the gRPC service registered above.
type FooService struct {
	s *testing.ServiceState
}

func (s *FooService) Bar(ctx context.Context, req *example.BarRequest) (*example.BarResponse, error) {
	...
}
```

**Service metadata** is represented by a testing.Service struct passed to
testing.AddService on registration.

Service methods have access to several features similar to tests. For example,
they can call testing.ContextLog to emit logs, and testing.ContextOutDir to save
output files. Those functions behave as if they're called in the remote test
calling into the current gRPC method.

Users have to declare in remote test metadata which services a remote test may
call into. This is required to build [entity graphs](#entities) before
execution.

#### Fixtures

A **fixture** sets up and maintains an **environment** to be shared by tests and
other fixtures.

An environment is an abstract term referring to a state of the target/host
system. Some possible environments a fixture may set up are, for example:

- The target ChromeOS device is at the login screen
- The target ChromeOS device is logged into a user session
- The target ChromeOS device is logged into a user session, and Crostini is
  enabled
- The target ChromeOS device is enrolled into an enterprise policy

Fixtures are registered to the framework in a way very similar to tests and
services. Fixture registration is done by an init function calling
testing.AddFixture with a testing.Fixture struct, representing fixture metadata.

Here is a minimal example of a fixture definition:

```go
func init() {
	testing.AddFixture(&testing.Fixture{
		Name: "someFixture",
		Impl: &someFixture{},
	})
}

// someFixture implements all five lifecycle methods as no-ops.
type someFixture struct{}

func (*someFixture) SetUp(ctx context.Context, s *testing.FixtState) interface{} { return nil }
func (*someFixture) TearDown(ctx context.Context, s *testing.FixtState)          {}
func (*someFixture) PreTest(ctx context.Context, s *testing.FixtTestState)       {}
func (*someFixture) PostTest(ctx context.Context, s *testing.FixtTestState)      {}
func (*someFixture) Reset(ctx context.Context) error                             { return nil }
```

A test can optionally **depend on** a fixture by declaring a dependency in its
metadata. If a test depends on a fixture, the fixture is used to set up a
desired environment before the test starts.

To let a fixture provide a consistent environment to tests, the framework calls
into a fixture's various **lifecycle methods**. There are 5 lifecycle methods:

- SetUp
- TearDown
- PreTest
- PostTest
- Reset

SetUp/TearDown are called when a fixture needs to set up / tear down an
environment. PreTest/PostTest are called before/after a test depending on the
fixture runs.

SetUp may return a **fixture value**, an arbitrary value that is made available
to its dependants. A fixture value is typically used to pass in-memory
objects and/or information related to the environment the fixture has set up.

**Reset** is a unique lifecycle method called between tests depending on the
fixture. In Reset, a fixture should perform a light-weight reset of the current
environment to one acceptable by the fixture. If it fails to do so, it should
return an error, which in turn causes the framework to tear down the fixture and
set it up again before the next test.
If Reset succeeds, the framework proceeds to run the next test without tearing
down the fixture. This lifecycle event allows fixtures to efficiently recover
from side effects that tests leave on the environment.

For example, let us think of a fixture that provides an environment where
"logged into a Chrome user session and all windows are closed". This fixture's
lifecycle methods can be implemented in the following way:

- SetUp: Restart UI and log into a new Chrome user session, and return a Chrome
  connection object as a fixture value
- TearDown: Log out from the session and close the connection object
- PreTest/PostTest: Do nothing
- Reset: Check that the Chrome process is intact, and close all open windows

A fixture is useful for multiple tests to share an environment whose setup is
costly. For example, let us think of 10 tests needing to run test scenarios in
a Chrome user session. Without fixtures, each test needs to log into a new user
session at its beginning since it doesn't know the current state of the target
system when it starts. This is not only inefficient, but can also elevate the
risk of test flakiness as the tests repeat the same login operations.
This problem can be solved by introducing a fixture that logs into a new user
session, and letting the 10 tests depend on the fixture. Then, when one or more
of the 10 tests are requested to run, the fixture is executed in advance
to log into a new user session, and tests run their test scenarios without
needing to repeat logins.

So far we have explained the most basic use of fixtures. But fixtures are a
powerful mechanism with the following features:

- A fixture can be local or remote. Local fixtures run on the target system,
  while remote fixtures run on the host system.
- Fixtures are **composable**: a fixture can also optionally depend on another
  fixture.
  A fixture cannot depend on itself directly or indirectly.
- Furthermore, **local tests/fixtures can depend on remote fixtures**. This
  allows writing local tests that interact with the target system remotely.

See the design doc of fixtures for more information.

#### Preconditions (deprecated)

Preconditions are a predecessor of fixtures. Preconditions tried to solve the
same problem in a limited way; they are not composable and have leaky boundaries
with tests.

#### Entities

An **entity** is a collective term for items registered to the framework with
metadata on initialization, and called back by the framework as needed. Today,
**tests, fixtures, and services** are entities.

An entity can declare dependencies on other entities in its metadata. The
diagram below indicates which entity can depend on which entity, and which
metadata field declares them.

When a test/fixture does not depend on a fixture explicitly, the framework
treats it internally as implicitly depending on the **virtual root fixture**.

An **entity graph** is a graph having tests and fixtures as nodes and fixture
dependencies as edges. **An entity graph forms a directed tree** whose root is
the virtual root fixture. The diagram below illustrates an example entity graph.

The most important property of an entity graph is that it can be statically
computed from entity metadata. This property allows the framework to compute all
entities relevant to tests requested to run before actually executing them by
traversing an entity graph from test nodes.

*** note
**Note**: Extended entity graph

Entity graphs do not contain services. We can define an extended entity graph
containing tests, fixtures and services. An extended entity graph is not a tree
but a directed acyclic graph (DAG).
***

#### Test bundles

A **test bundle** is a Go executable file built by linking user-defined entities
and their dependencies.

A test bundle can be local or remote. Local test bundles should link local
entities only, and vice versa. A local test bundle and a remote test bundle with
the same name are grouped; entities in the same group may interact, e.g. a local
test depending on a remote fixture, or a remote test depending on a service.

A test bundle's main.go is typically a small file that anonymously imports
packages where entities are defined, and defines a main function that calls into
a framework entry point function. Below is an example main file of a local test
bundle:

```go
package main

import (
	"os"

	"go.chromium.org/tast/core/bundle"

	// Underscore-imported packages register their tests via init functions.
	_ "go.chromium.org/tast-tests/cros/local/bundles/cros/apps"
	_ "go.chromium.org/tast-tests/cros/local/bundles/cros/arc"
	...
)

func main() {
	os.Exit(bundle.LocalDefault(bundle.Delegate{}))
}
```

bundle.LocalDefault/RemoteDefault accepts a bundle.Delegate struct which
specifies various hooks to be called by the framework. A run hook is called
before/after a test bundle is executed. A test hook is called before/after
a test is executed.

There are a few reasons to create a new test bundle. The first and foremost one
is ACL: if you want to make several tests public while keeping other tests
private, you need to create two test bundles, one for public tests and the other
one for private tests, so that external users who cannot check out private
source code can still build the public test bundle. Also, it would be useful to
create a new test bundle for a new target system (e.g. non-ChromeOS target
systems) since it can install a different set of hooks.

As of writing, we have only two test bundles: "cros" for public ChromeOS
tests and "crosint" for private ChromeOS tests. Since the two test bundles
share the same set of bundle.Delegate parameters, their main functions call into
the bundlemain support package, which in turn calls into
bundle.LocalDefault/RemoteDefault, to avoid duplication.

### Executables

Tast test execution involves three types of executables:

- **Tast CLI**, a.k.a. the "tast" command. This is an executable installed on
  the host system in advance. Test requesters run this command, and it
  communicates with other executables to run tests. In a local development
  environment, Tast CLI also invokes the Go toolchain to build other
  executables (the -build=true mode).
- **Local/remote test runner**. There are exactly two executables:
  "local_test_runner" installed onto the target system, and "remote_test_runner"
  installed onto the host system. They are built solely from the framework code
  and don't include any user-defined code. Tast CLI calls them to perform
  operations not specific to test bundles, and to run test bundles.
- **Local/remote test bundles**. As described above, they are executables
  containing user-defined entities.

## Guidance for future enhancements

This chapter gives guidance on framework enhancements in the future. We start
from higher-level principles and then go down to more detailed best practices.

### Key design principle

There are many general principles for software design, and most of them
are useful for Tast framework design. That said, one of the most important
design principles I have found useful specifically for Tast is:

**A good framework provides a small number of orthogonal features that cover
a large number of use cases.**

It is obvious that covering more use cases is better.
On the other hand, it is good to minimize the number of features because the
more features the framework provides, the more complex it becomes due to
interactions between the features.

### Considerations on designing a new feature

#### Do you really need the feature in the framework?

When evaluating a feature request, first ask yourself if you really need it in
the framework.

As described in the key design principle, we want to minimize the number of
features the framework provides. It's best if we can support use cases without
adding new features to the framework. Check if the feature can be implemented in
support libraries or with existing framework features. If we really need
a feature in the framework, do your best to design it to cover as many use cases
as possible.

When a proposed feature is useful only for certain use cases, it may mean that
the design is too specific to those use cases. In such cases, it often helps to
punt the feature until we learn more use cases and can better generalize the
requirements. If the feature request is high priority, consider implementing the
feature in support libraries, even if it looks unclean and/or results in more
boilerplate.

*** aside
**Example**: Faillog

Tast has a mechanism called faillog to capture logs such as screenshots on test
failures. We initially implemented faillog as a support library
([crbug.com/856540](https://crbug.com/856540)) since we were not sure if the
feature was useful for all tests. Faillog as a support library was not optimal
as tests interested in faillog had to be modified slightly to opt in.
After some experiments, faillog turned out to be useful for most tests, so we
merged the mechanism into the framework
([crbug.com/882729](https://crbug.com/882729)).
***

*** aside
**Example**: Screenshot tests

A proposal to extend the Tast control protocol was made for screenshot tests
([crrev.com/c/2422101](https://crrev.com/c/2422101)). After checking the
requirements, it turned out that they just wanted to run executables available
only on the host, so writing remote tests was sufficient.
***

*** aside
**Example**: -skipsort for MTBF tests

A proposal was made by MTBF test authors to add a new flag -skipsort to Tast CLI
([crrev.com/c/2429242](https://crrev.com/c/2429242)). The flag was meant to
disable Tast's internal test reordering and run tests in the exact order
specified in command line arguments.

Supporting this feature was technically possible. However, there were no other
use cases needing this feature, and the feature was expected to introduce
a lot of complexity to the framework. After discussion with relevant teams, we
agreed not to implement this feature.
***

*** aside
**Example**: Uploading crash dumps

A proposal was made to upload crash dumps generated during tests to Google
servers automatically ([crrev.com/c/2337754](https://crrev.com/c/2337754)).
The approach had a privacy implication since Tast has many users outside of
Google. In the end, the feature was implemented in the ChromeOS testing
infrastructure.
***

#### Interaction with other features

Think carefully about how a new feature interacts with existing features.

Enumerating interactions with existing features is a difficult task as you need
an understanding of all existing features in the framework. If you're unsure,
you may want to try creating a proof-of-concept implementation of the feature,
which can uncover some interactions you couldn't imagine in advance.

*** aside
**Example**: ContextSoftwareDeps

testing.ContextSoftwareDeps is a function that returns a list of software
dependencies declared by the current test. This function was introduced so that
certain support libraries can ensure that a calling test declares correct
software dependencies. For example, chrome.New calls this function to
ensure the current test declares the "chrome" software dependency
([crbug.com/954435](https://crbug.com/954435)).

Introduction of fixtures made the function less useful since there is no
"current test" when executing fixtures. The function is planned to be deleted
([crbug.com/1135996](https://crbug.com/1135996)).

As you see from this example, you should be careful when a feature works with
"the current test".
***

*** aside
**Example**: Direct test execution with local_test_runner

Usually Tast tests are initiated by Tast CLI installed on the host system.
However, local_test_runner installed on the target system can be executed
directly by test requesters to run local tests. This feature was
implemented in the very early days of Tast.

Currently direct test execution with local_test_runner is deprecated since we
have added several features that cannot be supported without a host system. For
example, local tests directly executed by local_test_runner cannot access secret
runtime variables as they're only installed on the host system. Also,
local_test_runner cannot execute local tests depending on remote fixtures.
***

#### Beware of versioning boundaries

Many CI systems deploy Tast for end-to-end testing, including ChromeOS, Chrome,
Android, Google3, and several other CI systems outside of Google. This means
that it is very difficult to make changes to the protocol between Tast and CI
systems, e.g.
adding/removing/changing Tast CLI flags or changing the test result
directory structure, since you cannot make atomic commits to Tast and all those
CI systems.

In general, we should be extremely careful about designing a new Tast CLI
feature for test requesters since it is difficult to make breaking changes.
As for Go APIs for test authors, we can be less strict as we can make atomic
commits to the framework and user code as of writing. However, once we start
having Tast tests outside of ChromeOS repositories, Go API stability will
become important.

*** aside
**Example**: Introducing group:mainline

In the early days of Tast, we had only three classifications of a test:
critical, informational, and disabled. The classification rule was simple:
a test is,
- disabled if it has the "disabled" attribute,
- informational if it has the "informational" attribute,
- critical otherwise.

After introducing non-functional tests (e.g. performance tests), we introduced
test group attributes. In the new rule, a test needed the "group:mainline"
attribute to be considered as critical/informational. To disable a test, the
"group:mainline" attribute could simply be removed.

Migration from the old rule to the new rule turned out to be very painful
because those rules had been hard-coded into several CI systems (ChromeOS,
Chrome, Android at that time) as attribute expressions. Therefore we needed to
do a step-by-step migration as described in
[go/tast-mainline-attr-transition](https://goto.google.com/tast-mainline-attr-transition).
***

*** aside
**Example**: Test selection by software dependencies

We had a bug where ARC-related tests were run in an unexpected way
([crbug.com/992303](https://crbug.com/992303)). The root cause was that
ARC-related software dependency names were renamed
(e.g.
"android" -> "android_p") while we continued to use the old software
dependency name to select ARC-related tests ("dep:android").

A lesson learned is that user-defined test metadata should not be directly
referenceable in attribute expressions. This is a problem we have to resolve
in the future.
***

*** aside
**Example**: Using new Tast CLI flags

We had a bug where Tast CLI failed to run because of unsupported flags on
release branches ([b/191779650](https://issuetracker.google.com/issues/191779650)).
It was because a new flag was added to Tast CLI but ChromeOS CI used an
unbranched config to specify a list of flags to pass to Tast CLI.

We think that this is a design bug in ChromeOS CI configuration: unbranched CI
configs should not construct Tast CLI flags that can change per branch. We
expect that this problem will be solved in the future.
***

#### Beware of ChromeOS specific logic

The Tast framework should focus on being a general remote testing framework, and
should be agnostic to the target/host system type.

Tast started as a testing framework for ChromeOS, so naturally it has several
pieces of hard-coded logic that assume that the target system is ChromeOS and
the host system is the ChromeOS chroot. But we expect that Tast will be used
outside of ChromeOS in the near future. Therefore it is good to avoid
introducing new ChromeOS specific logic to the framework, and to remove existing
ChromeOS specific logic from the framework.

If you need ChromeOS specific logic, consider whether you can put it in test
bundles or CI systems. If that is impossible, introduce a proper boundary
between the new ChromeOS specific logic and existing OS agnostic logic.

*** aside
**Example**: Test hooks

A proposal was made to the framework to run the auditctl command between tests
for debugging certain failures
([crrev.com/c/2513678](https://crrev.com/c/2513678)). Since this logic was
specific to ChromeOS, we introduced test hooks to test bundles and asked the
author to put the logic there.
***

*** aside
**Example**: ChromeOS infra specific APIs

For next-gen ChromeOS infra support, we added to the framework the logic to
resolve the target hostname and port with ChromeOS infra specific APIs. This
design turned out to be bad, and we're moving the logic out of the framework.
***

*** aside
**Example**: Downloading external data files

In the ChromeOS lab, when downloading data files from Google Cloud Storage, test
frameworks (not limited to Tast) are supposed to use Devservers, which act as
a sort of caching proxy server to Google Cloud Storage with private credentials.
Tast uses Devservers to download external data files needed by tests.

Test frameworks and Devservers use non-standard REST APIs to communicate.
Today, many non-ChromeOS infrastructures run Tast tests, but it is only ChromeOS
infra that provides Devservers to Tast to allow downloading ACL'ed external data
files.

In the near future we should replace Devserver protocol support in Tast.
***