github.com/grafana/pyroscope@v1.18.0/CLAUDE.md (about)

# Pyroscope - AI Agent Development Guide

This document provides context and guidance for AI coding assistants (Claude, Cursor, GitHub Copilot, etc.) working on the Pyroscope codebase.

## What is Pyroscope?

Pyroscope is a horizontally scalable, highly available, multi-tenant continuous profiling aggregation system.
It's designed to store and query profiling data at scale, similar to how Prometheus works for metrics and Loki for logs.

**Key Characteristics:**
- Written in **Go**
- Microservices-based architecture inspired by Cortex/Mimir/Loki
- Stores profiling data in object storage (S3, GCS, Azure, etc.)
- Multi-tenant by design

## Architecture Overview

Pyroscope uses a **microservices architecture** where a single binary can run different components based on the `-target` parameter.

### V1 Components

**Write Path:**
- **Distributor**: Receives profile ingestion requests, validates, and forwards to ingesters
- **Ingester**: Stores profiles in memory, periodically flushes to disk as blocks, periodically uploads blocks to long-term object storage
- **Compactor**: Merges blocks and removes duplicates

**Read Path:**
- **Query Frontend**: Entry point for queries, handles query splitting and caching
- **Query Scheduler**: Manages query queue and ensures fair execution across tenants
- **Querier**: Executes queries by fetching data from ingesters and store-gateways
- **Store Gateway**: Indexes and serves blocks from long-term object storage

### V2 Components

**Write Path:**
- **Distributor**: Receives profile ingestion requests, validates, and forwards to segment writers
- **Segment Writer**: Writes block segments to long-term object storage and the block metadata to metastore
- **Metastore**: Maintains an index for the block metadata and coordinates the block compaction process
- **Compaction Worker**: Merges small segments into larger blocks

**Read Path:**
- **Query Frontend**: Entry point for queries, creates the query plan and executes it against query backends
- **Query Backend**: Executes queries and merges query responses

### Storage

- **Block Format**: Profiles stored in Parquet tables, series data in a TSDB index, symbols in a custom format
- **Multi-tenant**: Each tenant has isolated storage
- **Object Storage**: Primary storage backend (S3, GCS, Azure, local filesystem)

## Repository Structure

```
.
├── cmd/
│   ├── pyroscope/           # Main server binary
│   └── profilecli/          # CLI tool for profile operations
├── pkg/                     # Core Go packages
│   ├── distributor/         # Distributor component
│   ├── ingester/            # Ingester component
│   ├── querier/             # Querier component
│   ├── frontend/            # Query frontend component
│   ├── compactor/           # Compactor component
│   ├── metastore/           # Metadata component
│   ├── phlaredb/            # V1 database storage engine
│   ├── model/               # Data models and types
│   ├── objstore/            # Object storage abstraction
│   ├── api/                 # API definitions and handlers
│   └── og/                  # Legacy code (original Pyroscope)
├── public/app/              # React/TypeScript frontend
├── api/                     # API definitions (protobuf, OpenAPI)
├── docs/                    # Documentation
├── operations/              # Deployment configs (jsonnet, helm)
├── examples/                # Example applications and SDKs
└── tools/                   # Development and build tools
```

## Tech Stack

### Backend
- **Language**: Go 1.24+
- **RPC**: gRPC with Connect protocol
- **Storage**: Parquet, TSDB
- **Hash Ring**: Consistent hashing with memberlist (gossip protocol)
- **Observability**: Prometheus metrics, structured logs, distributed traces, pprof profiles

### Frontend
- **Language**: TypeScript
- **Framework**: React
- **Build**: Webpack
- **Styling**: Emotion (CSS-in-JS)
- **State**: React hooks, Context API
- **UI Library**: Grafana UI components

### Testing
- **Go**: Standard `testing` package, testify for assertions
- **Frontend**: Jest, React Testing Library, Cypress (e2e)

## Development Workflow

### Setup & Build

```bash
# Install dependencies (Go 1.24+, Docker, Node v18, Yarn v1.22)
# All other tools auto-download to .tmp/bin/

# Build backend
make go/bin

# Run tests
make go/test

# Build frontend
yarn install
yarn dev          # Dev server on :4041

# Run backend for frontend development
yarn backend:dev  # Runs Pyroscope server

# Docker image
make GOOS=linux GOARCH=amd64 docker-image/pyroscope/build
```

### Code Generation

**IMPORTANT**: After changing protobuf, configs, or flags:
```bash
make generate
```
Commit the generated files with your changes.

### Running Locally

```bash
# Run all components in monolithic mode with embedded Grafana
go run ./cmd/pyroscope --target all,embedded-grafana
# Pyroscope: http://localhost:4040
# Grafana: http://localhost:4041
```

## Code Style & Conventions

### Go Code

1. **Imports**: Three groups separated by blank lines:
   ```go
   import (
       // Standard library
       "context"
       "fmt"

       // Third-party packages
       "github.com/prometheus/client_golang/prometheus"
       "go.uber.org/atomic"

       // Internal packages
       "github.com/grafana/pyroscope/pkg/model"
       "github.com/grafana/pyroscope/pkg/objstore"
   )
   ```

2. **Formatting**: Use `golangci-lint` (run via `make lint`)
   - gofmt for formatting
   - goimports with `-local github.com/grafana/pyroscope`

3. **Linting**:
   - Enabled: depguard, goconst, misspell, revive, unconvert, unparam
   - Use `github.com/go-kit/log` (NOT `github.com/go-kit/kit/log`)

4. **Error Handling**:
   - Always check errors explicitly
   - Wrap errors with context: `fmt.Errorf("failed to query: %w", err)`
   - Use structured logging: `level.Error(logger).Log("msg", "failed to process", "err", err)`

5. **Context**:
   - Always pass `context.Context` as the first parameter
   - Respect context cancellation in loops and long operations

6. **Testing**:
   - File naming: `*_test.go`
   - Test function naming: `TestFunctionName` or `TestComponentName_Method`
   - Use table-driven tests for multiple cases
   - Prefer `t.Run()` for subtests
   - Use `require` for fatal assertions, `assert` for non-fatal

### TypeScript/React Code

1. **File Extensions**: `.tsx` for components, `.ts` for utilities
2. **Components**: Use functional components with hooks
3. **Styling**: Use Emotion CSS-in-JS with Grafana UI theme
4. **Props**: Define explicit TypeScript interfaces for all component props
5. **Formatting**: Use Prettier (run via `yarn lint`)

## Common Patterns

### Multi-tenancy

All requests must include a tenant ID in the `X-Scope-OrgID` header:

```go
import "github.com/grafana/pyroscope/pkg/tenant"

// Extract tenant ID from context
tenantID, err := tenant.ExtractTenantIDFromContext(ctx)
if err != nil {
    return err
}
```
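To make the flow from header to context concrete, here is a standard-library sketch of how a tenant ID could travel from the `X-Scope-OrgID` header into a request context. It is an illustration under assumptions, not Pyroscope's implementation: `tenantKey`, `withTenant`, `tenantFromContext`, and `serve` are hypothetical names, and the real `pkg/tenant` package may store the ID differently.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// tenantKey is a hypothetical context key used only in this sketch.
type tenantKey struct{}

const orgIDHeader = "X-Scope-OrgID"

// withTenant rejects requests that lack the tenant header and stores the
// tenant ID in the request context for downstream handlers.
func withTenant(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id := r.Header.Get(orgIDHeader)
		if id == "" {
			http.Error(w, "missing "+orgIDHeader, http.StatusUnauthorized)
			return
		}
		ctx := context.WithValue(r.Context(), tenantKey{}, id)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// tenantFromContext mirrors the shape of an extract-from-context helper.
func tenantFromContext(ctx context.Context) (string, error) {
	id, ok := ctx.Value(tenantKey{}).(string)
	if !ok || id == "" {
		return "", errors.New("no tenant ID in context")
	}
	return id, nil
}

// serve sends one in-memory request through the middleware and returns the
// response status code and body.
func serve(tenantID string) (int, string) {
	h := withTenant(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id, err := tenantFromContext(r.Context())
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		fmt.Fprint(w, id)
	}))
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	if tenantID != "" {
		req.Header.Set(orgIDHeader, tenantID)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := serve("team-a")
	fmt.Println(code, body)
	code, body = serve("")
	fmt.Println(code, body)
}
```

Because handlers read the tenant only from the context, no code path needs a hardcoded tenant ID.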

### Consistent Hashing

Components use a hash ring for sharding:

```go
// Look up the instances that own a hashed key (a uint32 token),
// for example to pick the ingesters responsible for a series
replicationSet, err := ring.Get(key, op, bufDescs, bufHosts, bufZones)
```
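For readers new to the idea, the lookup behind a hash ring can be sketched in a few lines. This toy ring is not the dskit implementation and ignores zones, instance health, and gossip; `miniRing`, `newMiniRing`, `hashKey`, and the token count are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// hashKey maps a string to a 32-bit token.
func hashKey(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

// miniRing is a toy ring: sorted tokens, each owned by one instance.
type miniRing struct {
	tokens []uint32
	owners map[uint32]string
}

// newMiniRing registers tokensPerInstance tokens for every instance.
func newMiniRing(instances []string, tokensPerInstance int) *miniRing {
	r := &miniRing{owners: make(map[uint32]string)}
	for _, inst := range instances {
		for i := 0; i < tokensPerInstance; i++ {
			t := hashKey(fmt.Sprintf("%s-%d", inst, i))
			if _, taken := r.owners[t]; taken {
				continue // skip the rare token collision
			}
			r.owners[t] = inst
			r.tokens = append(r.tokens, t)
		}
	}
	sort.Slice(r.tokens, func(i, j int) bool { return r.tokens[i] < r.tokens[j] })
	return r
}

// get walks clockwise from the first token >= key and returns up to n
// distinct instances, wrapping around the end of the ring.
func (r *miniRing) get(key uint32, n int) []string {
	if len(r.tokens) == 0 {
		return nil
	}
	start := sort.Search(len(r.tokens), func(i int) bool { return r.tokens[i] >= key })
	seen := make(map[string]bool)
	var out []string
	for i := 0; i < len(r.tokens) && len(out) < n; i++ {
		inst := r.owners[r.tokens[(start+i)%len(r.tokens)]]
		if !seen[inst] {
			seen[inst] = true
			out = append(out, inst)
		}
	}
	return out
}

func main() {
	ring := newMiniRing([]string{"ingester-0", "ingester-1", "ingester-2"}, 16)
	// The same key always maps to the same instances while membership is stable.
	fmt.Println(ring.get(hashKey("tenant-a/service=api"), 2))
}
```

The second argument to `get` plays the role of a replication factor: the key is owned by the next `n` distinct instances on the ring.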

### Object Storage

Abstract object storage operations:

```go
import "github.com/grafana/pyroscope/pkg/objstore"

// Use the Bucket interface (construction shown schematically)
bucket := objstore.NewBucket(cfg)
reader, err := bucket.Get(ctx, "path/to/object")
if err != nil {
    return err
}
defer reader.Close() // Get returns an io.ReadCloser; close it when done
```

### Configuration

Use `github.com/grafana/dskit` for configuration:

```go
import "flag"

type Config struct {
    ListenPort int `yaml:"listen_port"`
    // Use RegisterFlags pattern
}

// RegisterFlags binds each field to a command-line flag with its default value.
func (cfg *Config) RegisterFlags(f *flag.FlagSet) {
    f.IntVar(&cfg.ListenPort, "server.http-listen-port", 4040, "HTTP listen port")
}
```

Run `make generate` after changing config definitions, to regenerate docs.

## Testing Best Practices

1. **Unit Tests**: Test individual functions/methods in isolation
2. **Integration Tests**: Use build tags: `//go:build integration`
3. **Mocking**: Use `mockery` for generating mocks from interfaces
4. **Fixtures**: Store test data in `testdata/` directories
5. **Parallel Tests**: Use `t.Parallel()` when tests are independent
6. **Cleanup**: Always use `t.Cleanup()` for resource cleanup

Example test:
```go
func TestDistributor_Push(t *testing.T) {
    t.Parallel()

    tests := []struct {
        name    string
        input   *pushv1.PushRequest
        wantErr bool
    }{
        {name: "valid request", input: validRequest(), wantErr: false},
        {name: "invalid tenant", input: invalidRequest(), wantErr: true},
    }

    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            d := setupDistributor(t)
            err := d.Push(context.Background(), tt.input)
            if tt.wantErr {
                require.Error(t, err)
            } else {
                require.NoError(t, err)
            }
        })
    }
}
```

## Common Pitfalls & Things to Avoid

1. **Don't** introduce dependencies on `pkg/og/` - this is legacy code being phased out
2. **Don't** use `github.com/go-kit/kit/log` - use `github.com/go-kit/log`
3. **Don't** forget to run `make generate` after changing protobuf/config definitions
4. **Don't** hardcode tenant IDs - always extract from context
5. **Don't** create unbounded goroutines - use worker pools or semaphores
6. **Don't** ignore context cancellation in loops
7. **Don't** log PII or sensitive data
8. **Don't** use `fmt.Println` for logging - use structured logging
9. **Don't** mix imports across the three import groups (keep standard library, third-party, and internal imports separate)
10. **Don't** commit changes to `node_modules/` or generated code without source changes

## Security Considerations

1. **Input Validation**: Always validate and sanitize user input
2. **Path Traversal**: Validate object keys before storage operations
3. **Rate Limiting**: Distributor implements per-tenant rate limiting
4. **Authentication**: Multi-tenancy via `X-Scope-OrgID` header (authentication delegated to gateway)
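One way to approach the path traversal item is to reject suspicious object keys before they reach the bucket. The rules below are assumptions chosen for illustration, and `validateObjectKey` is a hypothetical helper, not a Pyroscope API; a real check should follow whatever key layout the component actually uses.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// validateObjectKey accepts only relative, slash-separated keys whose
// segments are all non-empty and are neither "." nor "..".
func validateObjectKey(key string) error {
	if key == "" {
		return errors.New("empty object key")
	}
	if strings.Contains(key, "\\") {
		return fmt.Errorf("object key %q contains a backslash", key)
	}
	for _, seg := range strings.Split(key, "/") {
		// An empty segment covers a leading slash, a trailing slash, and "//".
		if seg == "" || seg == "." || seg == ".." {
			return fmt.Errorf("invalid segment %q in object key %q", seg, key)
		}
	}
	return nil
}

func main() {
	for _, key := range []string{
		"tenant-a/blocks/01/meta.json",
		"tenant-a/../tenant-b/meta.json",
		"/etc/passwd",
	} {
		fmt.Printf("%q -> %v\n", key, validateObjectKey(key))
	}
}
```

Checking segments instead of cleaning the path keeps the rule simple: a key is either stored exactly as given or rejected.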

## Performance Considerations

1. **Profiling**: This is a profiling system – profile your own changes!
   ```bash
   go test -cpuprofile=cpu.prof -memprofile=mem.prof -bench=.
   go tool pprof cpu.prof
   ```

2. **Allocations**: Minimize allocations in hot paths
   - Reuse buffers with `sync.Pool`
   - Avoid string concatenation in loops
   - Use `strings.Builder` for string building

3. **Concurrency**:
   - Use worker pools for bounded concurrency
   - Prefer channels for coordination over mutexes when possible
   - Always consider the scalability implications
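The allocation advice above can be sketched as follows. Both functions are hypothetical examples, not code from the repository: `formatLabels` reuses a `bytes.Buffer` from a `sync.Pool`, and `joinNames` builds a string with `strings.Builder` instead of `+=` in a loop.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
	"strings"
	"sync"
)

// bufPool hands out reusable buffers so hot paths avoid a fresh allocation
// per call.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// formatLabels renders labels as {k="v", ...} in sorted key order.
func formatLabels(labels map[string]string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // a pooled buffer may still hold data from a previous use
	defer bufPool.Put(buf)

	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	buf.WriteByte('{')
	for i, k := range keys {
		if i > 0 {
			buf.WriteString(", ")
		}
		buf.WriteString(k)
		buf.WriteString(`="`)
		buf.WriteString(labels[k])
		buf.WriteByte('"')
	}
	buf.WriteByte('}')
	return buf.String() // String copies the bytes, so reuse is safe
}

// joinNames concatenates names with commas using a single growing buffer.
func joinNames(names []string) string {
	var sb strings.Builder
	for i, n := range names {
		if i > 0 {
			sb.WriteByte(',')
		}
		sb.WriteString(n)
	}
	return sb.String()
}

func main() {
	fmt.Println(formatLabels(map[string]string{"service": "api", "env": "prod"}))
	fmt.Println(joinNames([]string{"cpu", "memory", "goroutine"}))
}
```

Whether pooling pays off depends on the call rate and buffer sizes, so it is worth confirming with a benchmark before and after the change.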

## Documentation

- **User Docs**: `docs/sources/` - Published to grafana.com
- **Contributing**: `docs/internal/contributing/README.md`
- **Component Docs**: In `docs/sources/reference-pyroscope-architecture/components/`

## Useful Make Targets

```bash
make help              # Show all available targets
make lint              # Run linters
make go/test           # Run Go unit tests
make go/bin            # Build binaries
make go/mod            # Tidy go modules
make generate          # Generate code (protobuf, mocks, etc.)
make docker-image/pyroscope/build  # Build Docker image
```

## Key Dependencies

- **dskit**: Grafana's distributed systems toolkit (ring, services, middleware)
- **connect**: RPC framework (gRPC-compatible)
- **parquet-go**: Parquet file format implementation
- **go-kit/log**: Structured logging
- **prometheus/client_golang**: Metrics instrumentation
- **opentracing-go**: Distributed tracing

## When Working on Features

1. **Read Component Docs**: Check `docs/sources/reference-pyroscope-architecture/components/` for the component you're modifying
2. **Understand the Ring**: If working on write/read path, understand consistent hashing
3. **Multi-tenancy First**: Always consider multi-tenant implications
4. **Check for Similar Code**: Pyroscope is inspired by Cortex/Mimir - similar patterns apply
5. **Test Multi-tenancy**: Test with multiple tenants to catch isolation issues
6. **Profile Your Changes**: Use `go test -bench` and verify performance impact
7. **Update Documentation**: If changing user-facing behavior, update docs

## Getting Help

- **Contributing Guide**: `docs/internal/contributing/README.md`
- **Code Comments**: The codebase has extensive comments – read them
- **Git History**: Use `git blame` and `git log` to understand design decisions

## Commit Guidelines

- **Atomic Commits**: Each commit should be a logical unit
- **Commit Messages**: Focus on "why" not just "what"
- **Generated Code**: Include generated files in the same commit as source changes
- **Format**: Follow existing commit message style (see `git log --oneline -20`)

## Additional Notes for AI Agents

- **Favor Simplicity**: Pyroscope values simple, maintainable code over clever abstractions
- **Performance Matters**: This system handles high-throughput profiling data
- **Multi-tenancy is Critical**: Tenant isolation bugs are severe – test thoroughly
- **Consistency with Grafana Labs Style**: Follow patterns from dskit, Mimir, Loki
- **Ask Before Large Refactors**: Propose significant architectural changes before implementing

---

For detailed setup and contributing instructions, see:
- `docs/internal/contributing/README.md` - Development setup and workflow
- `docs/sources/reference-pyroscope-architecture/` - System architecture deep dive