Cloning Django models or adding a differentiator field in the second model?

 2 years ago
source link: https://www.codesd.com/item/cloning-django-models-or-adding-a-differentiator-field-in-the-second-model.html

I have to store live-streaming 'test' data and training data in a MySQL database, and I need to create Django models for both. The structure of the data is exactly the same: time, value, label. The only difference is that one model would be used for training data and the other would hold the live test data (production data).

Which of these approaches would be better in terms of performance:

  1. Create two models TrainDataModel and TestDataModel.
  2. Create a single model 'Data' and add a boolean field, say 'training', to indicate whether a row belongs to the training or the test dataset.

Training would be done at the initial stages, and the training data would be much smaller than the test data, which would be huge (~20-30GB).

Processing data involves running classification algorithms over the data collected. In my particular case, the training data would have to be accessed frequently for every classification task.

  • For the first case, I would have to query two tables. Querying training data would be fast as the data size would be really small.
  • For the second case, the table would become huge, which would affect the query response time, but I would only have to access a single table.

Which would be faster for my use case?

I am a newbie at database query optimization, so suggestions and pointers would be appreciated. If there is an alternative way to do this (other than the two mentioned above), those suggestions are welcome as well.
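The two layouts described above can be compared on a small scale. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table and index names (`train_data`, `test_data`, `data`, `data_training_idx`) and the row counts are made up. Real MySQL query plans on 20-30GB of data may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Option 1: two tables with identical columns.
for table in ("train_data", "test_data"):
    cur.execute(
        f"CREATE TABLE {table} "
        "(id INTEGER PRIMARY KEY, time REAL, value REAL, label TEXT)"
    )

# Option 2: one table plus a boolean flag. The flag is indexed so that
# training rows can be located without scanning the live rows.
cur.execute(
    "CREATE TABLE data "
    "(id INTEGER PRIMARY KEY, time REAL, value REAL, label TEXT, "
    "training INTEGER NOT NULL)"
)
cur.execute("CREATE INDEX data_training_idx ON data (training)")

# Hypothetical sample: 10 training rows and 1000 live rows.
train_rows = [(float(i), i * 0.5, "a") for i in range(10)]
live_rows = [(float(i), i * 0.25, "b") for i in range(1000)]
insert = "INSERT INTO {} (time, value, label) VALUES (?, ?, ?)"
cur.executemany(insert.format("train_data"), train_rows)
cur.executemany(insert.format("test_data"), live_rows)
flagged = "INSERT INTO data (time, value, label, training) VALUES (?, ?, ?, {})"
cur.executemany(flagged.format(1), train_rows)
cur.executemany(flagged.format(0), live_rows)
conn.commit()


def training_count_two_tables():
    """Count training rows when they live in their own table."""
    return cur.execute("SELECT COUNT(*) FROM train_data").fetchone()[0]


def training_count_single_table():
    """Count training rows when they share a table with the live rows."""
    return cur.execute(
        "SELECT COUNT(*) FROM data WHERE training = 1"
    ).fetchone()[0]


def plan(sql):
    """Return SQLite's query plan for a statement as one string."""
    return " ".join(row[-1] for row in cur.execute("EXPLAIN QUERY PLAN " + sql))


print(plan("SELECT * FROM data WHERE training = 1"))
```

In this sketch both layouts return the same training rows. The single-table variant relies on the index on the flag to avoid reading the live rows, whereas the two-table variant never touches them in the first place.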


Use model inheritance

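from django.db import models

class DataModel(models.Model):
    # Shared columns; the field types are left elided as in the original answer.
    time = ...
    value = ...
    label = ...

    class Meta:
        abstract = True  # assumption: each subclass gets its own table, with no base table to join

class TrainDataModel(DataModel):
    pass  # training data

class TestDataModel(DataModel):
    pass  # live test (production) data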

For optimization you can add indexes and, as Lara says, consult the Django documentation.

