Skip to Main Content
 

Blog title

An Automatic DDL Strategy for Your Synthetic Test Data

INFO_LIST

  • Author Head Binubuo
  • Category Binubuo API
  • Created Wed, Feb 15 2023
 

blog sections

There are a few things certain in software development. We have a tendency to over-engineer our solution, We create tech debt out of the gate when we rush the first few versions. Not following practice and common sense; we never think about creating synthetic test data for our data store until we are supposed to have it.

This is where Binubuo can help you. You can create dataset generators that can mimic your data and patterns down to the last detail, if that is needed, but there is a faster (and maybe good enough for you) solution, where you do not have to create the dataset JSON definition at all. Binubuo can take advantage of a feature that most people really does not use when they create their DDL: Table Comments.

Binubuo can utilise your table and column comments in your database, when running the quick_fetch_table command in the python client. This means that we can instantly get test data into our table, that emulates what you are expecting in your production setup.

An example of comments

Most databases support comments on tables and columns, so when creating a table you can add them with the comment on command (syntax and command may differ). Below is a create table statement in Postgres:

create table test_comments (
  X int
  , Y text
);

Binubuo will check the comments on the table when you want to get synthetic test data like it. So if we add a comment for each of the columns that tells Binubuo what generator to use, we can very quickly get exactly the kind of test data we are looking for.

comment on test_comments.x is ‘small_number’;
comment on test_comments.y is ‘word’;

Using Binubuo to add comments

The nice thing, is that in Binubuo have added an interface for adding the comments, so it is the same across all the supported databases.

Example adding just comma separated

>>> from binubuo import binubuoPostgres
>>> bo = binubuoPostgres('YOUR_API_KEY', 'DATABASE_NAME', 'USERNAME')
>>> bo.connect()
>>> bo.columnGeneratorComments('test_comments', 'small_number,word')
-> Create comments on table [test_comments]. Comments=small_number,word.
  -> Comments for table [test_comments] successfully created.

Example of adding named columns

>>> from binubuo import binubuoPostgres
>>> bo = binubuoPostgres('YOUR_API_KEY', 'DATABASE_NAME', 'USERNAME')
>>> bo.connect()
>>> bo.columnGeneratorComments('test_comments', 'y=word,x=small_number')
-> Create comments on table [test_comments]. Comments=y=word,x=small_number.
  -> Comments for table [test_comments] successfully created.

Using Binubuo quick_fetch and Comments to create synthetic data

Once the comments are in place, you can immediately start to create your test data. By default quick_fetch_table will use the comments if they are available.

>>> from binubuo import binubuoPostgres
>>> bo = binubuoPostgres('YOUR_API_KEY', 'DATABASE_NAME', 'USERNAME')
>>> bo.connect()
>>> bo.quick_fetch_table('test_comments')
[[353, 'cakuh'], [613, 'ji'], [504, 'dipfi'], [974, 'letojge'], [516, 'cuvufeg'], [523, 'mujof'], [109, 'zoholod'], [615, 'rapij'], [782, 'wate'], [407, 'comgis']]

>>>

Inserting data immediately

Using the output_type parameter of quick_fetch_table you can get synthetic data into your table immediately. Just after creating your table and adding your comments, simply call quick_fetch_table, instructing the client to directly insert the results into the table.

So if we wanted 250 rows in our table, we could call quick_fetch_table the following way:

>>> from binubuo import binubuoPostgres
>>> bo = binubuoPostgres('YOUR_API_KEY', 'DATABASE_NAME', 'USERNAME')
>>> bo.connect()
>>> bo.tableSize('test_comments')
0
>>> bo.quick_fetch_table(table_name='test_comments', output_type='table', rows=250)
>>> bo.tableSize('test_comments')
250
>>> bo.table_sample('test_comments')
**************************************************************************
** Table data sample: TEST_COMMENTS
**************************************************************************
              x              y
 ============== ==============
            320        volteli
            578             we
            460          tozed
>>>

Changing the comments to change the data generated

Whenever your data requirements change or your column definitions change, you can change the comments so the random data generated follows your changes as well. Simply set the new generators in the comments and you are set.

So we change the column type of column x to float

ALTER TABLE test_comments
ALTER COLUMN x TYPE float;

and then we can change the comments. Let us make X a small amount and let us make Y into a name.

>>> from binubuo import binubuoPostgres
>>> bo = binubuoPostgres('YOUR_API_KEY', 'DATABASE_NAME', 'USERNAME')
>>> bo.connect()
>>> bo.columnGeneratorComments('test_comments', 'y=full_name,x=small_amount')
-> Create comments on table [test_comments]. Comments=y=full_name,x=small_amount.
  -> Comments for table [test_comments] successfully created.
>>> bo.quick_fetch_table('test_comments')
[[700.25, 'Eva Anderson'], [649.96, 'Scarlett Ramirez'], [762.37, 'Carson Hayes'], [358.69, 'Diego Simmons'], [429.53, 'Joshua Diaz'], [453.23, 'Avery Bell'], [725.86, 'Dominic Edwards'], [989.32, 'Parker Thomas'], [222.24, 'Ian Rogers'], [88.45, 'Brooklyn Richardson']]
>>> bo.quick_fetch_table(table_name='test_comments', output_type='table', rows=250)
>>> bo.tableSize('test_comments')
250
>>> bo.table_sample('test_comments')
**************************************************************************
** Table data sample: TEST_COMMENTS
**************************************************************************
              x              y
 ============== ==============
         404.89 Alexander Garc
         675.43 Mackenzie Walk
         608.89 Zachary Gonzal
>>>

Don't have an account yet?

If you don't have an account on Binubuo yet, you can create one real quick. Just click "Prices and Registration" in the top right corner, and you are on your way to create all the synthetic data you could dream about.

Want to see how to get started: Get Started Guide

Get more guides and help from the blog

Youtube Channel

Follow Binubuo on Twitter:


If you already have an account on RapidAPI, you can use your account to access Binubuo

Connect on RapidAPI