Concepts

Multiple Passes

The parser is implemented as a multiple-pass parser, so that the knowledge gained about the source is deepened pass by pass. Moreover, it is a streaming parser built on Python generators.

Passes:

  1. VHDL file pre-processing - needed for tool directives:

    • File decryption

    • Conditional analysis (new since VHDL-2019)

  2. Token generation

    • Slice a text file into character groups (tokens/words)

    • Preserve whitespace (space, tab, linebreak)

    • Preserve comments (single-/multi-line comments)

  3. Block generation

    • Assemble tokens in blocks (snippets of a statement) for faster document navigation

    • Exchange simple tokens (e.g. string token) with specific tokens (e.g. identifier or keyword)

  4. Group generation

    • Assemble blocks in groups (statements)

  5. Code-DOM generation

    • Consume a stream of groups to assemble the Code-DOM

    • Extract information from a group, its blocks or their specific tokens

  6. Comment annotation

    • Scan the data structure for comments and attach each comment to its statement

  7. Build language model

    • Combine multiple Code-DOMs to form a full language model

  8. Build dependencies

    • Analysis order

    • Type hierarchy

    • Instance hierarchy

  9. Checkers

    • Check symbols (identifiers, types, …)

    • Check code style

    • Check documentation

  10. Statistics

    • Create statistics (SLoC, Comments vs. Code, …)
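
The streaming character of the first passes can be illustrated with two chained generators. This is only a minimal sketch under simplifying assumptions: the function names tokenize and blocks are hypothetical, and the rule "a block ends at a token ending in ;" stands in for the real block parser.

```python
from typing import Iterable, Iterator, List


def tokenize(text: str) -> Iterator[str]:
    """Pass 2 (simplified): slice text into word and whitespace tokens."""
    buffer = ""
    for char in text:
        # A change between whitespace and non-whitespace ends the current token.
        if buffer and char.isspace() != buffer[-1].isspace():
            yield buffer
            buffer = ""
        buffer += char
    if buffer:
        yield buffer


def blocks(tokens: Iterable[str]) -> Iterator[List[str]]:
    """Pass 3 (simplified): collect tokens until a token ends with ';'."""
    block: List[str] = []
    for token in tokens:
        block.append(token)
        if token.endswith(";"):
            yield block
            block = []
    if block:
        yield block


# The passes are chained lazily: no pass runs before a result is requested.
source = "signal a : bit;\nsignal b : bit;"
result = list(blocks(tokenize(source)))
```

Because whitespace tokens are preserved, joining all tokens of all blocks reproduces the original text.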

Object-Oriented Programming

Data Structures

All internal data structures are implemented as classes with fields (Python calls them attributes), methods, properties (getters and setters) and references (pointers) to other instances of classes.

All data is accompanied by its modification procedures in the form of methods. A new instance of a class is created either by calling the class, which implicitly executes its initializer method __init__, or by calling a classmethod that helps to construct the instance.
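
Both ways of creating an instance can be sketched as follows. The class Token and its classmethod from_text are hypothetical names chosen for illustration; they are not the library's actual classes.

```python
class Token:
    """Hypothetical token class; not the library's actual Token."""

    def __init__(self, value: str, line: int) -> None:
        self.value = value
        self.line = line

    @classmethod
    def from_text(cls, text: str) -> "Token":
        """Alternative constructor: parse 'line:value' into a Token."""
        line, _, value = text.partition(":")
        return cls(value, int(line))

    def upper(self) -> None:
        """Modification procedure bundled with the data it changes."""
        self.value = self.value.upper()


first = Token("entity", 1)                  # calls __init__ implicitly
second = Token.from_text("2:architecture")  # classmethod as helper constructor
second.upper()
```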

Inheritance

pyVHDLParser makes heavy use of inheritance to share implementations and to allow other classes or a user to modify the behavior of all derived classes by modifying a single source.

Multiple Inheritance (Mixins)

pyVHDLParser uses multiple inheritance via mixin classes. This allows e.g. an abstract definition of data models, which are later combined with a parser.
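
A minimal sketch of this idea, assuming a hypothetical data model EntityModel and a hypothetical ParserMixin (neither name is taken from pyVHDLParser):

```python
class EntityModel:
    """Abstract data model: knows only about the data (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name


class ParserMixin:
    """Mixin adding parsing behaviour; holds no data of its own."""

    @classmethod
    def parse(cls, text: str):
        # Expects text of the form 'entity <name> is'.
        words = text.split()
        return cls(words[1])


class Entity(ParserMixin, EntityModel):
    """Combines the data model with the parser via multiple inheritance."""


entity = Entity.parse("entity counter is")
```

The mixin is listed first so that its methods take precedence in the method resolution order, while the data model stays usable without any parser.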

Properties

Instead of individual getter and setter methods, pyVHDLParser uses Python properties.
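
As an illustration, a read/write property could look like this. The class Block and its length attribute are assumptions made for the example only.

```python
class Block:
    """Hypothetical class demonstrating a read/write property."""

    def __init__(self) -> None:
        self._length = 0

    @property
    def length(self) -> int:
        return self._length

    @length.setter
    def length(self, value: int) -> None:
        if value < 0:
            raise ValueError("length must not be negative")
        self._length = value


block = Block()
block.length = 5      # invokes the setter
size = block.length   # invokes the getter
```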

Overwriting

Todo

Concepts -> OOP -> Overwriting

Overloading

Todo

Concepts -> OOP -> Overloading

Meta-Classes

Some additional behaviour is easier to implement by modifying the class that constructs other classes. Python calls such a class a meta-class. One prominent example is type, the default meta-class of every Python class.
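
The following sketch shows the mechanism with a hypothetical meta-class Registry that records every class it constructs. It illustrates the language feature and is not taken from pyVHDLParser.

```python
class Registry(type):
    """Hypothetical meta-class that records every class it constructs."""

    classes = []

    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        mcs.classes.append(name)
        return cls


class Statement(metaclass=Registry):
    pass


class Assignment(Statement):  # inherits the meta-class from Statement
    pass
```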

Type Annotations

pyVHDLParser uses type annotations in method parameter definitions and in class field declarations, so that IDEs and the documentation can show which types of objects are expected.
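
A short sketch of annotated fields and parameters, using a hypothetical class Group that is not the library's actual class:

```python
from typing import List, Optional


class Group:
    """Hypothetical class with annotated fields and method parameters."""

    name: str
    parent: Optional["Group"]
    children: List["Group"]

    def __init__(self, name: str, parent: Optional["Group"] = None) -> None:
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.add(self)

    def add(self, child: "Group") -> None:
        self.children.append(child)


root = Group("root")
leaf = Group("leaf", root)
```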

Double-Linked Lists

Data structures with direct references (pointers) in general, and doubly linked lists in particular, allow fast and typed navigation from object to object. If a reference has multiple endpoints, it is stored either in an order-preserving list or in an OrderedDict.

Many parts of pyVHDLParser, such as tokens, blocks and groups, form chains of doubly linked objects. These object chains (or linked lists) can easily be iterated. Iterators can consume such linked lists and re-emit the content in a modified way.

Moreover, such iterators can be packaged into Python generators.

Iterators and generators can be used in Python’s for [1] loops.
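
A doubly linked chain and a generator walking it might be sketched as below. The class Token and the functions forward and uppercase are hypothetical and far simpler than the real token classes.

```python
from typing import Iterator, Optional


class Token:
    """Hypothetical token with links to both of its neighbours."""

    def __init__(self, value: str, previous: Optional["Token"] = None) -> None:
        self.value = value
        self.previous = previous
        self.next: Optional["Token"] = None
        if previous is not None:
            previous.next = self   # close the link in the other direction


def forward(start: Token) -> Iterator[Token]:
    """Generator walking the chain from start to the last token."""
    token: Optional[Token] = start
    while token is not None:
        yield token
        token = token.next


def uppercase(tokens: Iterator[Token]) -> Iterator[str]:
    """Consume a chain and re-emit its content in a modified way."""
    for token in tokens:
        yield token.value.upper()


first = Token("signal")
second = Token("clk", first)
third = Token(";", second)

values = [value for value in uppercase(forward(first))]
```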

Python iterators

A Python iterable is an object implementing an __iter__ method returning an iterator. The iterator implements a __next__ method to return the next element in line. Usually, the iterator has some internal state, so it can compute the next element. At the end of an iteration, StopIteration is raised.

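from typing import Any, List


class Data:
  """Iterable container; __iter__ returns a fresh iterator on each call."""

  class Iterator:
    """Walks the list of a Data instance by index."""

    def __init__(self, obj: "Data") -> None:
      self.obj = obj      # the container being iterated
      self.index = 0      # position of the next element to return

    def __iter__(self) -> "Data.Iterator":
      return self         # iterators are themselves iterable

    def __next__(self) -> Any:
      try:
        element = self.obj.list[self.index]
      except IndexError:  # the list is exhausted
        raise StopIteration
      self.index += 1
      return element

  def __init__(self) -> None:
    self.list: List[Any] = []  # per-instance list, not shared between instances

  def __iter__(self) -> "Data.Iterator":
    return Data.Iterator(self)


myData = Data()
myData.list.extend(["a", "b", "c"])

for x in myData:
  print(x)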

Python generators

A Python generator is a co-routine (a function or method) that uses the yield statement to hand execution back from the callee to the caller, in most cases together with a value. The state of the routine (e.g. its local variables) is preserved. When the co-routine is resumed, execution continues right after the yield statement.

It is also possible to pass a value from the caller to the callee when resuming the co-routine, by using the generator's send method.

The generation of tokens, blocks and groups is implemented as generators that make heavy use of the yield statement.
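
The behaviour of yield and send can be seen in a small, self-contained sketch. The generator counter is a hypothetical example and unrelated to the parser itself.

```python
def counter(start: int):
    """Generator that counts upwards; a sent value replaces the counter."""
    value = start
    while True:
        received = yield value      # execution pauses here
        if received is not None:    # the caller used send()
            value = received
        else:                       # the caller used next()
            value += 1


gen = counter(10)
a = next(gen)       # runs up to the first yield
b = next(gen)       # resumes after the yield, local state is preserved
c = gen.send(100)   # passes 100 into the generator and resumes it
d = next(gen)
```

Note that send can only pass a value other than None once the generator has been started, which is why the first call uses next.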

Parallelism

Todo

Describe how to parallelize on multiple cores.

Token replacement

Todo

Describe why and how tokens are replaced. Describe why this is not corrupting data.

Classmethods as States

Todo

Describe why pyVHDLParser uses classmethods to represent parser states.

Parser State Machine

Todo

Describe how the parser works in pyVHDLParser.

Code-DOM

Todo

Describe what a Code-DOM is.

  • Clearly named classes that model the semantics of VHDL.

  • All language constructs (statements, declarations, specifications, …) have their own classes. These classes are arranged in a logical hierarchy, with a single common base-class.

  • Child objects shall have a reference to their parent.

  • Comments will be associated with a particular code object.

  • Easy modifications of the object tree.

  • Support formatting code objects as text for export and debugging.

  • Allow creating a Code-DOM from an input file or via API calls.

  • Support resolving of symbolic references into direct references to other objects.
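
Some of these goals (a single common base class, parent references, attached comments, text formatting) can be sketched as follows. All class and method names here (CodeObject, Entity, Port, to_text) are assumptions made for illustration; the actual Code-DOM classes may look different.

```python
from typing import List, Optional


class CodeObject:
    """Hypothetical common base class of all Code-DOM classes."""

    def __init__(self, parent: Optional["CodeObject"] = None) -> None:
        self.parent = parent                  # reference to the parent object
        self.children: List["CodeObject"] = []
        self.comments: List[str] = []         # comments attached to this object
        if parent is not None:
            parent.children.append(self)

    def to_text(self, indent: int = 0) -> str:
        """Format this object and its children as text."""
        lines = ["  " * indent + "-- " + c for c in self.comments]
        lines.append("  " * indent + str(self))
        lines.extend(child.to_text(indent + 1) for child in self.children)
        return "\n".join(lines)


class Entity(CodeObject):
    def __init__(self, name: str, parent: Optional[CodeObject] = None) -> None:
        super().__init__(parent)
        self.name = name

    def __str__(self) -> str:
        return f"entity {self.name}"


class Port(CodeObject):
    def __init__(self, name: str, parent: Optional[CodeObject] = None) -> None:
        super().__init__(parent)
        self.name = name

    def __str__(self) -> str:
        return f"port {self.name}"


entity = Entity("counter")
entity.comments.append("8-bit counter")
port = Port("clk", entity)
text = entity.to_text()
```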


Footnotes: