     1  Configuration Language Design
     2  =============================
     3  
     4  In this section we will cover some conventions for HCL-based configuration
     5  languages that can help make them feel consistent with other HCL-based
     6  languages, and make the best use of HCL's building blocks.
     7  
     8  HCL's native and JSON syntaxes both define a mapping from input bytes to a
     9  higher-level information model. In designing a configuration language based on
    10  HCL, your building blocks are the components in that information model:
    11  blocks, arguments, and expressions.
    12  
    13  Each calling application of HCL, then, effectively defines its own language.
    14  Just as Atom and RSS are higher-level languages built on XML, HashiCorp
    15  Terraform has a higher-level language built on HCL, while HashiCorp Nomad has
    16  its own distinct language that is *also* built on HCL.
    17  
    18  From an end-user perspective, these are distinct languages but have a common
    19  underlying texture. Users of both are therefore likely to bring some
    20  expectations from one to the other, and so this section is an attempt to
    21  codify some of these shared expectations to reduce user surprise.
    22  
    23  These are subjective guidelines, however, and so applications may choose to
    24  ignore them entirely or ignore them in certain specialized cases. An
    25  application providing a configuration language for a pre-existing system, for
    26  example, may choose to eschew the identifier naming conventions in this section
    27  in order to exactly match the existing names in that underlying system.
    28  
    29  Language Keywords and Identifiers
    30  ---------------------------------
    31  
    32  Much of the work in defining an HCL-based language is in selecting good names
    33  for arguments, block types, variables, and functions.
    34  
    35  The standard for naming in HCL is to use all-lowercase identifiers with
    36  underscores separating words, like ``service`` or ``io_mode``. HCL identifiers
    37  do allow uppercase letters and dashes, but this is primarily for natural
    38  interfacing with external systems that may have other identifier conventions,
    39  and so these should generally be avoided for the identifiers native to your
    40  own language.
    41  
    42  The distinction between "keywords" and other identifiers is really just a
    43  convention. In your own language documentation, you may use the word "keyword"
    44  to refer to names that are presented as an intrinsic part of your language,
    45  such as important top-level block type names.
    46  
    47  Block type names are usually singular, since each block defines a single
    48  object. Use a plural block name only if the block is serving only as a
    49  namespacing container for a number of other objects. A block with a plural
    50  type name will generally contain only nested blocks, and no arguments of its
    51  own.
    52  
    53  Argument names are also singular unless they expect a collection value, in
    54  which case they should be plural. For example, ``name = "foo"`` but
    55  ``subnet_ids = ["abc", "123"]``.
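
        As a minimal sketch of these conventions, consider the following fragment
        of a hypothetical language. The ``services`` and ``service`` block types
        and all of the argument names are invented for illustration and do not
        belong to any real application:

        .. code-block:: hcl

           # Plural block type: only a namespacing container, with no arguments
           # of its own.
           services {
             # Singular block type: each block defines a single object.
             service "web" {
               name       = "frontend"      # singular: expects a single value
               io_mode    = "async"         # all-lowercase, underscore-separated
               subnet_ids = ["abc", "123"]  # plural: expects a collection value
             }
           }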
    56  
    57  Function names will generally *not* use underscores and will instead just run
    58  words together, as is common in the C standard library. This is a result of
    59  the fact that several of the standard library functions offered in ``cty``
    60  (covered in a later section) have names that follow C library function names
    61  like ``strlen``. This is not a strong rule, and applications that use longer
    62  names may choose to use underscores for them to improve readability.
    63  
    64  Blocks vs. Object Values
    65  ------------------------
    66  
    67  HCL blocks and argument values of object type have quite a similar appearance
    68  in the native syntax, and are identical in JSON syntax:
    69  
    70  .. code-block:: hcl
    71  
    72     block {
    73       foo = bar
    74     }
    75  
    76     # argument with object constructor expression
    77     argument = {
    78       foo = bar
    79     }
    80  
    81  In spite of this superficial similarity, there are some important differences
    82  between these two forms.
    83  
    84  The most significant difference is that a child block can contain nested blocks
    85  of its own, while an object constructor expression can define only attributes
    86  of the object it is creating.
    87  
    88  The user-facing model for blocks is that they generally form the more "rigid"
    89  structure of the language itself, while argument values can be more free-form.
    90  An application will generally define in its schema and documentation all of
    91  the arguments that are valid for a particular block type, while arguments
    92  accepting object constructors are more appropriate for situations where the
    93  arguments themselves are freely selected by the user, such as when the
    94  expression will be converted by the application to a map type.
    95  
    96  As a less contrived example, consider the ``resource`` block type in Terraform
    97  and its use with a particular resource type ``aws_instance``:
    98  
    99  .. code-block:: hcl
   100  
   101     resource "aws_instance" "example" {
   102       ami           = "ami-abc123"
   103       instance_type = "t2.micro"
   104  
   105       tags = {
   106         Name = "example instance"
   107       }
   108  
   109       ebs_block_device {
   110         device_name = "hda1"
   111         volume_size = 8
   112         volume_type = "standard"
   113       }
   114     }
   115  
   116  The top-level block type ``resource`` is fundamental to Terraform itself and
   117  so an obvious candidate for block syntax: it maps directly onto an object in
   118  Terraform's own domain model.
   119  
   120  Within this block we see a mixture of arguments and nested blocks, all defined
   121  as part of the schema of the ``aws_instance`` resource type. The ``tags``
   122  map here is specified as an argument because its keys are free-form, chosen
   123  by the user and mapped directly onto a map in the underlying system.
   124  ``ebs_block_device`` is specified as a nested block, because it is a separate
   125  domain object within the remote system and has a rigid schema of its own.
   126  
   127  As a special case, block syntax may sometimes be used with free-form keys if
   128  those keys each serve as a separate declaration of some first-class object
   129  in the language. For example, Terraform has a top-level block type ``locals``
   130  which behaves in this way:
   131  
   132  .. code-block:: hcl
   133  
   134     locals {
   135       instance_type = "t2.micro"
   136       instance_id   = aws_instance.example.id
   137     }
   138  
   139  Although the argument names in this block are arbitrarily selected by the
   140  user, each one defines a distinct top-level object. In other words, this
   141  approach is used to create a more ergonomic syntax for defining these simple
   142  single-expression objects, as a pragmatic alternative to more verbose and
   143  redundant declarations using blocks:
   144  
   145  .. code-block:: hcl
   146  
   147     local "instance_type" {
   148       value = "t2.micro"
   149     }
   150     local "instance_id" {
   151       value = aws_instance.example.id
   152     }
   153  
   154  The distinction between domain objects, language constructs and user data will
   155  always be subjective, so the final decision is up to you as the language
   156  designer.
   157  
   158  Standard Functions
   159  ------------------
   160  
   161  HCL itself does not define a common set of functions available in all HCL-based
   162  languages; the built-in language operators give a baseline of functionality
   163  that is always available, but applications are free to define functions as they
   164  see fit.
   165  
   166  With that said, there are a number of generally-useful functions that don't
   167  belong to the domain of any one application: string manipulation, sequence
   168  manipulation, date formatting, JSON serialization and parsing, etc.
   169  
   170  Given the general need such functions serve, it's helpful if a similar set of
   171  functions is available with compatible behavior across multiple HCL-based
   172  languages, assuming the language is for an application where function calls
   173  make sense at all.
   174  
   175  The Go implementation of HCL is built on an underlying type and function system
   176  :go:pkg:`cty`, whose usage was introduced in :ref:`go-expression-funcs`. That
   177  library also has a package of "standard library" functions which we encourage
   178  applications to offer with consistent names and compatible behavior, either by
   179  using the standard implementations directly or offering compatible
   180  implementations under the same name.
   181  
   182  The "standard" functions that new configuration formats should consider
   183  offering are:
   184  
   185  * ``abs(number)`` - returns the absolute (positive) value of the given number.
   186  * ``coalesce(vals...)`` - returns the value of the first argument that isn't null. Useful only in formats where null values may appear.
   187  * ``compact(vals...)`` - returns a new tuple with the non-null values given as arguments, preserving order.
   188  * ``concat(seqs...)`` - builds a tuple value by concatenating together all of the given sequence (list or tuple) arguments.
   189  * ``format(fmt, args...)`` - performs simple string formatting similar to the C library function ``printf``.
   190  * ``hasindex(coll, idx)`` - returns true if the given collection has the given index. ``coll`` may be of list, tuple, map, or object type.
   191  * ``int(number)`` - returns the integer component of the given number, rounding towards zero.
   192  * ``jsondecode(str)`` - interprets the given string as JSON format and returns the corresponding decoded value.
   193  * ``jsonencode(val)`` - encodes the given value as a JSON string.
   194  * ``length(coll)`` - returns the length of the given collection.
   195  * ``lower(str)`` - converts the letters in the given string to lowercase, using Unicode case folding rules.
   196  * ``max(numbers...)`` - returns the highest of the given number values.
   197  * ``min(numbers...)`` - returns the lowest of the given number values.
   198  * ``sethas(set, val)`` - returns true only if the given set has the given value as an element.
   199  * ``setintersection(sets...)`` - returns the intersection of the given sets.
   200  * ``setsubtract(set1, set2)`` - returns a set with the elements from ``set1`` that are not also in ``set2``.
   201  * ``setsymdiff(sets...)`` - returns the symmetric difference of the given sets.
   202  * ``setunion(sets...)`` - returns the union of the given sets.
   203  * ``strlen(str)`` - returns the length of the given string in Unicode grapheme clusters.
   204  * ``substr(str, offset, length)`` - returns a substring from the given string by splitting it between Unicode grapheme clusters.
   205  * ``timeadd(time, duration)`` - takes a timestamp in RFC3339 format and a possibly-negative duration given as a string like ``"1h"`` (for "one hour") and returns a new RFC3339 timestamp after adding the duration to the given timestamp.
   206  * ``upper(str)`` - converts the letters in the given string to uppercase, using Unicode case folding rules.
   207  
   208  Not all of these functions will make sense in all applications. For example, an
   209  application that doesn't use set types at all would have no reason to provide
   210  the set-manipulation functions here.
   211  
   212  Some languages will not provide functions at all, since they are primarily for
   213  assigning values to arguments and thus do not need nor want any custom
   214  computations of those values.
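
        For a language that does offer them, calls to some of the functions
        listed above might look like the following sketch. The argument names
        and the referenced variables (``service_name``, ``custom_port``,
        ``public_subnet_ids``, ``private_subnet_ids``) are hypothetical and
        assume the application has placed such values in its evaluation context:

        .. code-block:: hcl

           # Function names are from the list above; everything else is assumed.
           display_name = upper(service_name)
           listen_port  = coalesce(custom_port, 8080)
           all_subnets  = concat(public_subnet_ids, private_subnet_ids)
           subnet_count = length(all_subnets)
           label        = format("%s-%d", service_name, subnet_count)
           settings     = jsondecode("{\"debug\": true}")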
   215  
   216  Block Results as Expression Variables
   217  -------------------------------------
   218  
   219  In some applications, top-level blocks serve also as declarations of variables
   220  (or of attributes of object variables) available during expression evaluation,
   221  as discussed in :ref:`go-interdep-blocks`.
   222  
   223  In this case, it's most intuitive for the variables map in the evaluation
   224  context to contain a value named after each valid top-level block
   225  type and for these values to be object-typed or map-typed and reflect the
   226  structure implied by block type labels.
   227  
   228  For example, an application may have a top-level ``service`` block type
   229  used like this:
   230  
   231  .. code-block:: hcl
   232  
   233    service "http" "web_proxy" {
   234      listen_addr = "127.0.0.1:8080"
   235  
   236      process "main" {
   237        command = ["/usr/local/bin/awesome-app", "server"]
   238      }
   239  
   240      process "mgmt" {
   241        command = ["/usr/local/bin/awesome-app", "mgmt"]
   242      }
   243    }
   244  
   245  If the result of decoding this block were available for use in expressions
   246  elsewhere in configuration, the above convention would call for it to be
   247  available to expressions as an object at ``service.http.web_proxy``.
   248  
   249  If it is the contents of the block itself that are offered to evaluation -- or
   250  a superset object *derived* from the block contents -- then the block arguments
   251  can map directly to object attributes, but it is up to the application to
   252  decide which value type is most appropriate for each block type, since this
   253  depends on how multiple blocks of the same type relate to one another, or if
   254  multiple blocks of that type are even allowed.
   255  
   256  In the above example, an application would probably expose the ``listen_addr``
   257  argument value as ``service.http.web_proxy.listen_addr``, and may choose to
   258  expose the ``process`` blocks as a map of objects using the labels as keys,
   259  which would allow an expression like
   260  ``service.http.web_proxy.process["main"].command``.
   261  
   262  If multiple blocks of a given type do not have a significant order relative to
   263  one another, as seems to be the case with these ``process`` blocks,
   264  representation as a map is often the most intuitive. If the ordering of the
   265  blocks *is* significant then a list may be more appropriate, allowing the use
   266  of HCL's "splat operators" for convenient access to child arguments. However,
   267  there is no one-size-fits-all solution here and language designers must
   268  instead consider the likely usage patterns of each value and select the
   269  value representation that best accommodates those patterns.
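
        To make the trade-off concrete, the two alternative representations of
        the ``process`` blocks from the example above might be referenced as in
        the following sketch. The ``main_command`` and ``all_commands`` argument
        names are hypothetical, and an application would normally choose only
        one of the two representations:

        .. code-block:: hcl

           # If the process blocks are exposed as a map keyed by block label:
           main_command = service.http.web_proxy.process["main"].command

           # If the process blocks were instead exposed as a list, a splat
           # expression collects the same argument from every element:
           all_commands = service.http.web_proxy.process[*].command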
   270  
   271  Some applications may choose to offer variables with slightly different names
   272  than the top-level blocks in order to allow for more concise references, such
   273  as abbreviating ``service`` to ``svc`` in the above examples. This should be
   274  done with care since it may make the relationship between the two less obvious,
   275  but this may be a good tradeoff for names that are accessed frequently that
   276  might otherwise hurt the readability of expressions they are embedded in.
   277  Familiarity permits brevity.
   278  
   279  Many applications will not make block results available for use in other
   280  expressions at all, in which case they are free to select whichever variable
   281  names make sense for what is being exposed. For example, a format may make
   282  environment variable values available for use in expressions, and may do so
   283  either as top-level variables (if no other variables are needed) or as an
   284  object named ``env``, which can be used as in ``env.HOME``.
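
        As a brief sketch, assuming the application exposes environment
        variables as an object named ``env``, a reference inside a string
        template could look like this (``cache_dir`` is a hypothetical argument
        name):

        .. code-block:: hcl

           # Interpolates the HOME environment variable into a path.
           cache_dir = "${env.HOME}/.cache/example-app"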
   285  
   286  Text Editor and IDE Integrations
   287  --------------------------------
   288  
   289  Since HCL defines only low-level syntax, a text editor or IDE integration for
   290  HCL itself can only really provide basic syntax highlighting.
   291  
   292  For non-trivial HCL-based languages, a more specialized editor integration may
   293  be warranted. For example, users writing configuration for HashiCorp Terraform
   294  must recall the argument names for numerous different provider plugins, and so
   295  auto-completion and documentation hovertips can be a great help, and
   296  configurations are commonly spread over multiple files making "Go to Definition"
   297  functionality useful. None of this functionality can be implemented generically
   298  for all HCL-based languages since it relies on knowledge of the structure of
   299  Terraform's own language.
   300  
   301  Writing such text editor integrations is out of the scope of this guide. The
   302  Go implementation of HCL does have some building blocks to help with this, but
   303  it will always be an application-specific effort.
   304  
   305  However, in order to *enable* such integrations, it is best to establish a
   306  conventional file extension *other than* ``.hcl`` for each non-trivial HCL-based
   307  language, thus allowing text editors to recognize it and enable the suitable
   308  integration. For example, Terraform requires ``.tf`` and ``.tf.json`` filenames
   309  for its main configuration, and the ``hcldec`` utility in the HCL repository
   310  accepts spec files that should conventionally be named with an ``.hcldec``
   311  extension.
   312  
   313  For simple languages that are unlikely to benefit from specific editor
   314  integrations, using the ``.hcl`` extension is fine and may cause an editor to
   315  enable basic syntax highlighting, absent any other deeper features. An editor
   316  extension for a specific HCL-based language should *not* generically match the
   317  ``.hcl`` extension, since this can cause confusing results for users
   318  attempting to write configuration files targeting other applications.