Datasets in BigML store statistical information about the contents of the fields detected in the Source resource: the number of rows, the cases where information is missing, and the errors found (such as numeric fields that contain words instead of numbers). They also store the histograms that describe the distribution of values in each field, and the non-preferred marks assigned to fields that are constant or unique IDs (both generally useless for modeling).
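To make the kind of per-field summary described above more concrete, here is a minimal sketch in plain Python. It is not BigML's implementation and does not call the BigML API. The function name `summarize_field`, the dictionary keys, and the rule that only non-numeric fields can be flagged as unique IDs are all assumptions chosen for illustration.

```python
from collections import Counter


def summarize_field(values, numeric=False):
    """Summarize one column of raw values (hypothetical helper).

    None and the empty string are treated as missing. For numeric
    fields, values that cannot be parsed as numbers count as errors.
    """
    present = [v for v in values if v not in (None, "")]
    missing = len(values) - len(present)

    errors = 0
    if numeric:
        parsed = []
        for v in present:
            try:
                parsed.append(float(v))
            except (TypeError, ValueError):
                errors += 1  # e.g. a word inside a numeric field
        present = parsed

    # A simple value count stands in for a histogram here.
    counts = Counter(present)

    # Assumed heuristics: a field is non-preferred when it is constant,
    # or when it is non-numeric and every value is different (ID-like).
    constant = len(counts) <= 1
    unique_id = (not numeric) and len(present) > 1 and len(counts) == len(present)

    return {
        "rows": len(values),
        "missing_count": missing,
        "errors": errors,
        "counts": dict(counts),
        "preferred": not (constant or unique_id),
    }


age = summarize_field(["31", "", "forty", "25"], numeric=True)
country = summarize_field(["ES", "ES", "ES"])
user_id = summarize_field(["a1", "b2", "c3"])
```

In this sketch, `age` reports one missing value and one error, while `country` (constant) and `user_id` (all values distinct) are both marked as non-preferred.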
To learn about all the configuration options available for datasets in BigML's Dashboard, see the corresponding section of the Datasets documentation. The Datasets API documentation also describes the attributes that need to be specified to configure your dataset programmatically.