Increasingly, organizations with an online platform, website, or mobile app store the user and marketing data of their users and visitors in their own data lakes. This way, they always have access to the data and are free to choose their own tools.
With smartocto Connect, you can share this business data with our backend systems, so that our tools can turn it into useful insights for your organization. By sending pre-aggregated and anonymized data, you also avoid sharing any PII (Personally Identifiable Information) with smartocto.
Data types
With smartocto Connect, both aggregated and raw data can be ingested. Not all tools from the smartocto Ecosystem need raw data to provide the right insights for your organization. By sending us only the types of data each tool needs, in the right composition, you can keep data costs low. The data must be delivered in JSON format. This lightweight data-interchange format makes it possible to transfer the data securely, transparently, and efficiently. The table below shows which type of data composition smartocto Connect accepts per tool and what the minimum required delivery frequency is.
tool | type | frequency |
smartocto Insights | raw | 1 millisecond |
smartocto Realtime | aggregated/raw | 60 seconds |
smartocto Smartify | aggregated/raw | 60 seconds |
smartocto Targets | aggregated/raw | 60 seconds |
smartocto Waves | aggregated/raw | 60 seconds |
As the table above shows, smartocto Insights accepts only raw data, exactly as it was created. This allows smartocto Insights to interpret the data without the information loss caused by aggregation or by reduced time resolution. Raw data for smartocto Insights may be delivered in batches, for example on an hourly basis, but the data itself must retain millisecond granularity.
Aggregated data example
Here is a small example of what an aggregated document format looks like.
[
  {
    "relationId": "4080775",
    "platform": "web",
    "medium": "referral",
    "source": "www.tracesofwar.nl",
    "timestamp": "2022-05-06T11:12:00+00:00",
    "views": 10,
    "subType": "dataLake 1",
    "type": "pageViews"
  },
  {
    "relationId": "4081433",
    "platform": "app",
    "medium": "social",
    "source": "facebook",
    "timestamp": "2022-05-06T11:12:00+00:00",
    "views": 6,
    "subType": "dataLake 2",
    "type": "pageViews"
  }
]
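To illustrate how raw events could be rolled up into documents like the ones above, here is a minimal Python sketch. It assumes a hypothetical raw event shape that carries the same relationId, platform, medium, and source fields plus a millisecond timestamp; the actual raw format is defined in the smartocto documentation, and the function name aggregate_page_views is only illustrative.

```python
from collections import Counter
from datetime import datetime


def aggregate_page_views(raw_events, sub_type):
    """Roll raw page-view events up into one document per minute and dimension set.

    The raw event shape used here is an assumption made for illustration only.
    """
    counts = Counter()
    for event in raw_events:
        # Truncate the event time to the start of its minute.
        minute = datetime.fromisoformat(event["timestamp"]).replace(
            second=0, microsecond=0
        )
        key = (
            event["relationId"],
            event["platform"],
            event["medium"],
            event["source"],
            minute,
        )
        counts[key] += 1

    # Emit documents with the same fields as the aggregated example.
    return [
        {
            "relationId": relation_id,
            "platform": platform,
            "medium": medium,
            "source": source,
            "timestamp": minute.isoformat(),
            "views": views,
            "subType": sub_type,
            "type": "pageViews",
        }
        for (relation_id, platform, medium, source, minute), views in counts.items()
    ]
```

Grouping by minute matches the 60-second frequency listed in the table; a different window would only change how the timestamp is truncated.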
Data delivery
smartocto Connect supports different ways of providing data:
- Ingestion API
- Bulk upload
- Kafka replication
For all three methods, the structure of the input data is strict and standardized, so that we always understand what data we are receiving. The data must therefore be prepared and delivered in an agreed format. The format differs per information property and is described in our online documentation. Data that is not supplied in the correct format is treated as invalid. You are informed about the problem through the monitoring system and directly in the return message at ingestion.
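Because invalid records are flagged at ingestion, it can help to check records on your own side before sending them. The Python sketch below is only an illustration: the field names and types are copied from the aggregated example in this document, not from the official schema, and the function name find_problems is hypothetical. The online documentation remains the authoritative description of the format.

```python
from datetime import datetime

# Assumption: fields and types mirror the aggregated example in this document.
# The authoritative schema is the one in the online documentation.
EXPECTED_FIELDS = {
    "relationId": str,
    "platform": str,
    "medium": str,
    "source": str,
    "timestamp": str,
    "views": int,
    "subType": str,
    "type": str,
}


def find_problems(record):
    """Return a list of human-readable problems found in one record."""
    problems = []
    for name, expected_type in EXPECTED_FIELDS.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            problems.append(
                f"wrong type for {name}: expected {expected_type.__name__}"
            )

    # Check that the timestamp parses as an ISO 8601 date-time.
    timestamp = record.get("timestamp")
    if isinstance(timestamp, str):
        try:
            datetime.fromisoformat(timestamp)
        except ValueError:
            problems.append("timestamp is not ISO 8601")
    return problems
```

An empty list means the record passed these local checks; it does not guarantee that the record will be accepted at ingestion.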
Ingestion API
The ingestion API allows you, after authentication, to deliver multiple JSON records at once via HTTP POST. This is the most accessible method of supplying smartocto Connect with data.
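As a rough sketch of what such a delivery could look like, the Python example below builds an HTTP POST request carrying several JSON records. The endpoint URL, the bearer-token authentication scheme, and the names build_ingest_request and send are assumptions made for illustration; the real endpoint and authentication method are provided by smartocto.

```python
import json
import urllib.request

# Hypothetical values: replace with the endpoint and credentials you receive.
INGEST_URL = "https://ingest.example.com/v1/records"
API_TOKEN = "replace-with-your-token"


def build_ingest_request(records, url=INGEST_URL, token=API_TOKEN):
    """Build a POST request whose body is a JSON array of records."""
    body = json.dumps(records).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token authentication.
            "Authorization": f"Bearer {token}",
        },
    )


def send(request, timeout=10):
    """Send the request and return the status code and the response text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")
```

Building the request separately from sending it makes it easy to inspect or log the payload before delivery. The text returned by send is where a return message about invalid data would appear.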
Bulk upload
With bulk upload, you can upload compressed files to your own private S3 bucket, which we create on sign-up and host for you. S3 is Amazon's long-term object storage service, which stores data redundantly. Data is picked up from this bucket, processed, and stored in the smartocto backend systems. Picking up, processing, and writing the data into the different systems takes time, which can lead to visible delays in the real-time dashboards. This type of data delivery is therefore less suitable for smartocto Realtime and smartocto Waves.
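Preparing a compressed file for bulk upload might look like the Python sketch below. It assumes gzip compression and a file that contains a single JSON array; both are assumptions, as the accepted compression type and file layout are not specified here. The upload to the S3 bucket itself is left out.

```python
import gzip
import json


def compress_records(records):
    """Serialize records as a JSON array and gzip the result.

    Assumption: gzip and a single JSON array per file; check the
    smartocto documentation for the accepted layout.
    """
    return gzip.compress(json.dumps(records).encode("utf-8"))


def decompress_records(payload):
    """Reverse compress_records, useful for verifying a file before upload."""
    return json.loads(gzip.decompress(payload).decode("utf-8"))
```

Verifying the round trip locally is a cheap way to catch a corrupt file before it reaches the bucket.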
Kafka replication
If you already run a Kafka cluster that holds all of your data, you can use Kafka replication. This option requires some configuration on both sides. We use Kafka's internal authentication system to process the data stream securely and to keep out unauthorized parties.
Ingestion Monitoring Platform Dashboard
The Ingestion Monitoring Platform Dashboard provides feedback on the number of (bulk) files and the amount of data consumed, processed, and failed. Each customer that delivers information to smartocto Connect gets its own status page, which shows the latest processing information.
In the Ingestion Monitoring Platform Dashboard, you can see the time difference between delivery and final processing. This makes it possible to take action, for example by informing you (the customer) that the (bulk) files need to be smaller or that something is out of sync. It will also be possible to send notifications to selected stakeholders.
Clear agreements need to be made up front, so that the right people can be reached and action can be taken, especially outside office hours.