Integration Testing

Integration testing uses SynPUF data. The data dictionary for SynPUF data is located here. The SynPUF data itself is stored in S3 at s3://novisci-test-data/cms-synpufs/data/.

ETL - asclepias

Setup

  1. Install Node.js, following the additional instructions for integration testing.

  2. Install the AWS CLI tools.

  3. Install the Haskell toolchain.
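
Before moving on, it can help to confirm that the tools from the setup steps are actually on your PATH. The following is a minimal sketch, not part of the project: the function name missing_tools is hypothetical, and the tool names in the example call (node, npm, aws, ghc, cabal) are assumptions about what the three installs provide.

```shell
# Print the name of every listed tool that cannot be found on PATH.
# Prints nothing when all of the tools are available.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
    fi
  done
}

# Example call (tool names are assumptions, adjust to your setup):
#   missing_tools node npm aws ghc cabal
```

If the call prints any names, revisit the matching install step before running the ETL.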

Get ETL Outputs

  1. Clone the kafka-datapipe project, and open the project in your preferred IDE.

  2. In the terminal, run npm install.

  3. In the terminal, run the following command:

    ./plans/synpuf/test-parsers.sh

  4. Verify that synpuf-events.jsonl was created.
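
The check in the last step can be scripted. This is a rough sketch under stated assumptions: check_etl_output is a hypothetical helper, and the path you pass it depends on where test-parsers.sh writes its output, which this page does not state. It only confirms that the file exists and is non-empty, and prints the number of lines; it does not validate the JSON.

```shell
# Succeed only if the given file exists and is non-empty.
# On success, print the number of lines (one event per line in a .jsonl file).
check_etl_output() {
  file="$1"
  if [ ! -s "$file" ]; then
    echo "missing or empty: $file" >&2
    return 1
  fi
  wc -l < "$file" | tr -d ' '
}

# Example call (the path is an assumption):
#   check_etl_output synpuf-events.jsonl
```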

Verify ETL Outputs

TODO - add steps for verifying JSONSchema produced in event-data-model

Verify ETL Outputs in an asclepias project

  1. Clone the etl-integration project, and open the project in your preferred IDE.

  2. Open cabal.project.

  3. In cabal.project, if needed, update the tag for event-data-model.

    source-repository-package
       type: git
       location: https://cabal-project:xXv9gyJbBb3jzwo826Q9@gitlab.com/targetrwe/epistats/nsstat//event-data-model.git
       tag: {YOUR_TAG}
       subdir:
        fact-models

  4. In cabal.project, if needed, update the tag for asclepias.

    source-repository-package
      type: git
      location: https://cabal-project:J38uzQKemY1n17Jq_yb_@gitlab.com/targetrwe/epistats/nsstat/asclepias.git
      tag: {YOUR_TAG}
      subdir:
        hasklepias-core
        hasklepias-main
        event-data-theory

  5. Move the ETL output file, synpuf-events.jsonl, from the kafka-datapipe repo to the plans/data folder in the etl-integration repo.

  6. Update the integration tests if needed.

  7. Run cabal build all and resolve any issues.

  8. Run cabal test all and resolve any issues.
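
The file move in step 5 can be sketched as a small helper. The name stage_etl_output and its two arguments are hypothetical; the location of synpuf-events.jsonl inside the kafka-datapipe checkout is left to the caller because this page does not specify it.

```shell
# Move the ETL output into the plans/data folder of an etl-integration checkout.
#   $1: path to synpuf-events.jsonl (supplied by the caller)
#   $2: path to the root of the etl-integration checkout
stage_etl_output() {
  src="$1"
  dest_dir="$2/plans/data"
  if [ ! -f "$src" ]; then
    echo "not found: $src" >&2
    return 1
  fi
  # Create plans/data if it does not exist yet, then move the file into it.
  mkdir -p "$dest_dir" && mv "$src" "$dest_dir/"
}
```

After staging the file, run cabal build all and then cabal test all from the root of the etl-integration checkout, as in steps 7 and 8.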

asclepias - R Project

R Project - CausalStudio