<!-- Do not edit this file. It is automatically generated by API Documenter. -->

[Home](./index.md) &gt; [@datashaper/schema](./schema.md)

## schema package

## Enumerations

|  Enumeration | Description |
|  --- | --- |
|  [BinStrategy](./schema.binstrategy.md) | Describes the binning technique to use. See numpy for detailed definitions: https://numpy.org/doc/stable/reference/generated/numpy.histogram\_bin\_edges.html |
|  [BooleanComparisonOperator](./schema.booleancomparisonoperator.md) |  |
|  [BooleanOperator](./schema.booleanoperator.md) |  |
|  [CodebookStrategy](./schema.codebookstrategy.md) |  |
|  [DataFormat](./schema.dataformat.md) | Base format the data is stored within. This will expand to include additional formats such as Arrow and Parquet over time. TODO: we've seen a number of examples in the wild using JSON Lines https://jsonlines.org/ |
|  [DataNature](./schema.datanature.md) | Indicates the expected general layout of the data. This could be used to provide validation hints. For example, microdata must have one row per subject. TODO: "timeseries" as distinct from "panel"? others? |
|  [DataOrientation](./schema.dataorientation.md) | <p>Indicates the orientation of the data within the file.</p><p>Most CSV data files are 'values' (row-oriented).</p><p>JSON files can commonly be either. Records are probably more common, though they require more space due to replication of keys. Apache Arrow or Parquet are columnar. This nearly aligns with pandas: https://pandas.pydata.org/pandas-docs/stable/user\_guide/io.html\#json</p><p>A key difference (which probably needs to be resolved) is that we don't yet support the notion of an index. See their example for "columns" or "index" orientation, which is a nested structure.</p><p>Example JSON formats: records: \[{ colA: valueA1, colB: valueB1 }<!-- -->, { colA: valueA2, colB: valueB2 }<!-- -->\]</p><p>columnar: { colA: \[valueA1, valueA2\], colB: \[valueB1, valueB2\] }</p><p>array: \["value1", "value2"\]</p><p>values: \[ \["colA", "colB"\], \["valueA1", "valueB1"\], \["valueA2", "valueB2"\] \]</p> |
|  [DataType](./schema.datatype.md) | Explicit data type of the value (i.e., for a column or property). TODO: clarify/update null/undefined |
|  [DateComparisonOperator](./schema.datecomparisonoperator.md) |  |
|  [ErrorCode](./schema.errorcode.md) |  |
|  [FieldAggregateOperation](./schema.fieldaggregateoperation.md) | This is the subset of aggregate functions that can operate on a single field, so no additional arguments are accommodated. See https://uwdata.github.io/arquero/api/op\#aggregate-functions |
|  [FilterCompareType](./schema.filtercomparetype.md) | Indicates the comparison type used for a filter operation. This is done on a row-by-row basis. |
|  [JoinStrategy](./schema.joinstrategy.md) |  |
|  [KnownProfile](./schema.knownprofile.md) |  |
|  [KnownRel](./schema.knownrel.md) |  |
|  [MathOperator](./schema.mathoperator.md) |  |
|  [MergeStrategy](./schema.mergestrategy.md) |  |
|  [NumericComparisonOperator](./schema.numericcomparisonoperator.md) |  |
|  [ParseType](./schema.parsetype.md) | This is a subset of data types available for parsing operations. |
|  [SetOp](./schema.setop.md) | Indicates the type of set operation to perform across two collections. |
|  [SortDirection](./schema.sortdirection.md) |  |
|  [StringComparisonOperator](./schema.stringcomparisonoperator.md) |  |
|  [VariableNature](./schema.variablenature.md) | Describes the semantic shape of a variable. This has particular effect on how we display and compare data, such as using line charts for continuous versus bar charts for categorical. This mostly applies to numeric variables, but strings for instance can be categorical. |
|  [Verb](./schema.verb.md) |  |
|  [WindowFunction](./schema.windowfunction.md) | These are operations that perform windowed compute. See https://uwdata.github.io/arquero/api/op\#window-functions |
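
The `DataOrientation` layouts described above ('records' vs. columnar) can be illustrated with a small conversion sketch. This is an illustration only, not part of the `@datashaper/schema` API; the helper names are hypothetical.

```typescript
// Hypothetical helpers illustrating two DataOrientation layouts:
// "records": one object per row; columnar: one array per column.
type Records = Record<string, unknown>[]
type Columnar = Record<string, unknown[]>

function recordsToColumnar(records: Records): Columnar {
	const result: Columnar = {}
	for (const row of records) {
		for (const [key, value] of Object.entries(row)) {
			;(result[key] ??= []).push(value)
		}
	}
	return result
}

function columnarToRecords(columns: Columnar): Records {
	const keys = Object.keys(columns)
	const length = keys.length > 0 ? columns[keys[0]].length : 0
	return Array.from({ length }, (_, i) =>
		Object.fromEntries(keys.map(key => [key, columns[key][i]])),
	)
}

const records = [
	{ colA: 'valueA1', colB: 'valueB1' },
	{ colA: 'valueA2', colB: 'valueB2' },
]
// recordsToColumnar(records) yields
// { colA: ['valueA1', 'valueA2'], colB: ['valueB1', 'valueB2'] }
```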

## Functions

|  Function | Description |
|  --- | --- |
|  [createCodebookSchemaObject(input)](./schema.createcodebookschemaobject.md) |  |
|  [createDataPackageSchemaObject(input)](./schema.createdatapackageschemaobject.md) |  |
|  [createDataTableSchemaObject(input)](./schema.createdatatableschemaobject.md) |  |
|  [createSchemaValidator()](./schema.createschemavalidator.md) |  |
|  [createTableBundleSchemaObject(input)](./schema.createtablebundleschemaobject.md) |  |
|  [createWorkflowSchemaObject(input)](./schema.createworkflowschemaobject.md) |  |
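
The `create*SchemaObject` factories presumably merge caller input with package defaults under the latest `$schema` reference. A hedged sketch of that pattern follows; every name, shape, and URL here is an illustrative assumption, not the real signature (see the linked pages for the actual API).

```typescript
// Illustrative only: sketches the common "defaults + input" factory pattern.
// The shape and schema URL below are placeholders, not the package's values.
interface WorkflowSchemaSketch {
	$schema: string
	name?: string
	steps: unknown[]
}

// Placeholder for the real LATEST_WORKFLOW_SCHEMA constant.
const LATEST_WORKFLOW_SCHEMA_URL = 'https://example.org/schemas/workflow.json'

function createWorkflowSchemaObjectSketch(
	input: Partial<WorkflowSchemaSketch>,
): WorkflowSchemaSketch {
	// Apply defaults first so caller input wins on conflicts.
	return { $schema: LATEST_WORKFLOW_SCHEMA_URL, steps: [], ...input }
}
```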

## Interfaces

|  Interface | Description |
|  --- | --- |
|  [AggregateArgs](./schema.aggregateargs.md) |  |
|  [BasicInput](./schema.basicinput.md) | Single-input, single-output step I/O |
|  [Bin](./schema.bin.md) | Describes a data bin in terms of inclusive lower bound and count of values in the bin. |
|  [BinArgs](./schema.binargs.md) |  |
|  [BinarizeArgs](./schema.binarizeargs.md) |  |
|  [BooleanArgs](./schema.booleanargs.md) |  |
|  [BundleSchema](./schema.bundleschema.md) | A schema for defining custom bundle types. |
|  [Category](./schema.category.md) | Describes a nominal category in terms of category name and count of values in the category. |
|  [CodebookSchema](./schema.codebookschema.md) | This contains all of the field-level details for interpreting a dataset, including data types, mapping, and metadata. Note that with persisted metadata and field examples, a dataset can often be visualized and described to the user without actually loading the source file. resource profile: 'codebook' |
|  [Constraints](./schema.constraints.md) | Validation constraints for a field. |
|  [ConvertArgs](./schema.convertargs.md) |  |
|  [CopyArgs](./schema.copyargs.md) |  |
|  [Criterion](./schema.criterion.md) |  |
|  [DataPackageSchema](./schema.datapackageschema.md) | Defines a Data Package, which is a collection of data resources such as files and schemas. Loosely based on the Frictionless spec, but modified where needed to meet our needs. https://specs.frictionlessdata.io/data-package/ |
|  [DataShape](./schema.datashape.md) | Defines parameters for understanding the logical structure of data contents. |
|  [DataTableSchema](./schema.datatableschema.md) | This defines the table-containing resource type. A dataset can be embedded directly using the <code>data</code> property, or it can be linked to a raw file using the <code>path</code>. If the latter, optional format and parsing options can be applied to aid interpreting the file contents. resource profile: 'datatable' |
|  [DeriveArgs](./schema.deriveargs.md) |  |
|  [DestructureArgs](./schema.destructureargs.md) |  |
|  [DualInput](./schema.dualinput.md) | Dual-input, single-output step I/O |
|  [EncodeDecodeArgs](./schema.encodedecodeargs.md) |  |
|  [EraseArgs](./schema.eraseargs.md) |  |
|  [Field](./schema.field.md) | Contains the full schema definition and metadata for a data field (usually a table column). This includes the required data type, various data nature and rendering properties, potential validation rules, and mappings from a data dictionary. |
|  [FieldError](./schema.fielderror.md) |  |
|  [FieldMetadata](./schema.fieldmetadata.md) | Holds core metadata/stats for a data field. |
|  [FillArgs](./schema.fillargs.md) |  |
|  [FilterArgs](./schema.filterargs.md) |  |
|  [FoldArgs](./schema.foldargs.md) |  |
|  [ImputeArgs](./schema.imputeargs.md) |  |
|  [InputColumnArgs](./schema.inputcolumnargs.md) |  |
|  [InputColumnListArgs](./schema.inputcolumnlistargs.md) | Base interface for a number of operations that work on a column list. |
|  [InputColumnRecordArgs](./schema.inputcolumnrecordargs.md) |  |
|  [InputKeyValueArgs](./schema.inputkeyvalueargs.md) |  |
|  [JoinArgs](./schema.joinargs.md) |  |
|  [JoinArgsBase](./schema.joinargsbase.md) |  |
|  [LookupArgs](./schema.lookupargs.md) |  |
|  [MergeArgs](./schema.mergeargs.md) |  |
|  [Named](./schema.named.md) | Base interface for sharing properties of named resources/objects. |
|  [OnehotArgs](./schema.onehotargs.md) |  |
|  [OrderbyArgs](./schema.orderbyargs.md) |  |
|  [OrderbyInstruction](./schema.orderbyinstruction.md) |  |
|  [OutputColumnArgs](./schema.outputcolumnargs.md) |  |
|  [ParserOptions](./schema.parseroptions.md) | Parsing options for delimited files. This is a mix of the options from pandas and Spark. |
|  [PivotArgs](./schema.pivotargs.md) |  |
|  [PrintArgs](./schema.printargs.md) |  |
|  [RecodeArgs](./schema.recodeargs.md) |  |
|  [RelationshipConstraint](./schema.relationshipconstraint.md) |  |
|  [ResourceSchema](./schema.resourceschema.md) | Parent interface for any resource type understood by the system. Any object type that extends from Resource is expected to have a standalone schema published. For project state, this can be left as generic as possible for now. |
|  [RollupArgs](./schema.rollupargs.md) |  |
|  [SampleArgs](./schema.sampleargs.md) |  |
|  [SpreadArgs](./schema.spreadargs.md) |  |
|  [StepJsonCommon](./schema.stepjsoncommon.md) | Common step properties |
|  [StringsArgs](./schema.stringsargs.md) |  |
|  [StringsReplaceArgs](./schema.stringsreplaceargs.md) |  |
|  [TableBundleSchema](./schema.tablebundleschema.md) | <p>A table bundle encapsulates table-specific resources into a single resource with a prescribed workflow.</p><p>A table bundle requires a <code>source</code> entry with rel="input" for the source table. A table bundle may also include <code>source</code> entries with rel="codebook" and rel="workflow" for interpretation and processing of the source data table.</p> |
|  [TypeHints](./schema.typehints.md) | Configuration values for interpreting data types when parsing a delimited file. By default, all values are read as strings; applying these type hints can derive primitive types from the strings. |
|  [UnhotArgs](./schema.unhotargs.md) |  |
|  [UnknownInput](./schema.unknowninput.md) |  |
|  [ValidationResult](./schema.validationresult.md) |  |
|  [VariadicInput](./schema.variadicinput.md) | Multi-input, single-output step I/O |
|  [WindowArgs](./schema.windowargs.md) |  |
|  [WorkflowArgs](./schema.workflowargs.md) |  |
|  [WorkflowSchema](./schema.workflowschema.md) | The root wrangling workflow specification. resource profile: 'workflow' |

## Variables

|  Variable | Description |
|  --- | --- |
|  [DataTableSchemaDefaults](./schema.datatableschemadefaults.md) |  |
|  [LATEST\_CODEBOOK\_SCHEMA](./schema.latest_codebook_schema.md) |  |
|  [LATEST\_DATAPACKAGE\_SCHEMA](./schema.latest_datapackage_schema.md) |  |
|  [LATEST\_DATATABLE\_SCHEMA](./schema.latest_datatable_schema.md) |  |
|  [LATEST\_TABLEBUNDLE\_SCHEMA](./schema.latest_tablebundle_schema.md) |  |
|  [LATEST\_WORKFLOW\_SCHEMA](./schema.latest_workflow_schema.md) |  |
|  [ParserOptionsDefaults](./schema.parseroptionsdefaults.md) |  |
|  [TypeHintsDefaults](./schema.typehintsdefaults.md) | This is a collection of default string values for inferring strict types from strings. They replicate the defaults from pandas. https://pandas.pydata.org/pandas-docs/stable/user\_guide/io.html\#csv-text-files |
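
`TypeHintsDefaults` drives string-to-type inference when parsing delimited text. A minimal sketch of that kind of inference follows; the sentinel lists below are illustrative, loosely modeled on pandas-style defaults, and may not match the package's actual values.

```typescript
// Illustrative inference of primitive values from strings. The sentinel
// lists are assumptions for this sketch, not the real TypeHintsDefaults.
const NULL_VALUES = ['', 'NA', 'N/A', 'NaN', 'null', 'NULL']
const TRUE_VALUES = ['true', 'TRUE', 'True']
const FALSE_VALUES = ['false', 'FALSE', 'False']

function inferValue(raw: string): boolean | number | null | string {
	if (NULL_VALUES.includes(raw)) return null
	if (TRUE_VALUES.includes(raw)) return true
	if (FALSE_VALUES.includes(raw)) return false
	const num = Number(raw)
	// Reject whitespace-only strings, which Number() coerces to 0.
	if (raw.trim() !== '' && !Number.isNaN(num)) return num
	return raw
}
```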

## Type Aliases

|  Type Alias | Description |
|  --- | --- |
|  [DedupeArgs](./schema.dedupeargs.md) |  |
|  [DropArgs](./schema.dropargs.md) |  |
|  [FactoryInput](./schema.factoryinput.md) |  |
|  [GroupbyArgs](./schema.groupbyargs.md) |  |
|  [InputBinding](./schema.inputbinding.md) |  |
|  [Profile](./schema.profile.md) | Resources must have a profile, which is a key defining how it should be interpreted. Profiles are essentially shorthand for a schema URL. The core profiles for DataShaper are defined here, but any application can define one as a string. |
|  [Rel](./schema.rel.md) | A rel is a string that describes the relationship between a resource and its child. |
|  [RenameArgs](./schema.renameargs.md) |  |
|  [SelectArgs](./schema.selectargs.md) |  |
|  [Step](./schema.step.md) | Specification for step items |
|  [UnfoldArgs](./schema.unfoldargs.md) |  |
|  [UnrollArgs](./schema.unrollargs.md) |  |
|  [ValidationFunction](./schema.validationfunction.md) |  |
|  [Value](./schema.value.md) | A cell/property value of any type. |
|  [WorkflowInput](./schema.workflowinput.md) |  |
|  [WorkflowStepId](./schema.workflowstepid.md) | The Id of the step to which the input is bound |
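
Since a `Profile` is described above as shorthand for a schema URL, the resolution step can be sketched as a simple lookup. The mapping below is hypothetical; the real profile keys are the `KnownProfile` values, and the URLs here are placeholders.

```typescript
// Hypothetical profile-to-schema-URL lookup illustrating the Profile concept.
// Keys and URLs are placeholders, not the package's actual registry.
const profileToSchemaUrl: Record<string, string> = {
	codebook: 'https://example.org/schemas/codebook.json',
	datatable: 'https://example.org/schemas/datatable.json',
	workflow: 'https://example.org/schemas/workflow.json',
}

function resolveProfile(profile: string): string | undefined {
	return profileToSchemaUrl[profile]
}
```

Applications defining their own profile strings would simply extend such a registry with their own schema URLs.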

