The challenges in teaching machines to see like humans
A machine trained on a large number of different website templates can cluster web pages by layout and display a distinct layout to each user group. That sounds simple, but how do we get machines to perceive visuals and layout patterns the way humans do? There are a few challenges to overcome first.
Data Collection
For the machine to recognize different page templates, there are two key categories of data we are interested in: the page screenshot in image form, and the element positions in JavaScript Object Notation (JSON) format.
Next, we need to decide how many web pages to target as input. We then triple the dataset by varying the viewport size of each page across desktop, tablet, and mobile views.
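The viewport-tripling step above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the viewport dimensions and file-naming scheme are assumptions, and the actual screenshot capture (e.g. via a headless browser) is assumed to happen elsewhere.

```python
# Hypothetical viewport sizes (width, height) for the three views.
VIEWPORTS = {
    "desktop": (1920, 1080),
    "tablet": (768, 1024),
    "mobile": (375, 667),
}

def build_dataset_entries(url):
    """Return one dataset entry per viewport for a single page URL.

    Each entry pairs a screenshot file with a JSON file of element
    positions, so every input page yields three dataset rows.
    """
    entries = []
    for view, (width, height) in VIEWPORTS.items():
        entries.append({
            "url": url,
            "viewport": view,
            "width": width,
            "height": height,
            # Filenames are placeholders for files the capture step would write.
            "screenshot": f"{view}_screenshot.png",
            "elements_json": f"{view}_elements.json",
        })
    return entries

entries = build_dataset_entries("https://example.com")
print(len(entries))  # one page tripled across three viewports
```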
Results
- After the machine completes its extensive training and generates results, how do we decide whether those results are accurate?
- Machine learning on numerical/categorical data (supervised learning) usually requires preparing another, smaller set of labelled data (validation data)
- This is acceptable because the researchers already know what class to expect from the data
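The validation idea above can be shown with a small sketch: hold out a labelled slice of the data, then score the model only on that slice. The split fraction, seed, and toy template labels ("hero", "grid") are assumptions for illustration.

```python
import random

def train_validation_split(samples, validation_fraction=0.2, seed=42):
    """Shuffle labelled samples and split them into training and validation sets."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - validation_fraction))
    return shuffled[:cut], shuffled[cut:]

def accuracy(model, validation):
    """Fraction of validation samples whose predicted class matches the label."""
    correct = sum(1 for x, label in validation if model(x) == label)
    return correct / len(validation)

# Toy labelled data: (feature, expected template class).
samples = [(i, "hero" if i % 2 == 0 else "grid") for i in range(100)]
train, val = train_validation_split(samples)
print(len(train), len(val))  # 80 20
```

Because the expected classes are known in advance, accuracy on the held-out set is a direct measure of how well the trained model generalizes.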
Underfitting or Overfitting
- Underfitting occurs when the model is too simple to capture the underlying structure of the data, leaving it too little information to build an accurate model; in layman's terms, it is like trying to fit into undersized pants
- Overfitting is the opposite scenario: the model fits the training data so closely, noise included, that it fails to generalize to new data
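The two failure modes above can be demonstrated on toy data with two deliberately extreme models: a constant predictor (underfits, missing the trend entirely) and a lookup table that memorizes every training point (overfits, perfect on training data but poor on fresh data). The data-generating function and noise level are assumptions for illustration.

```python
import random

rng = random.Random(0)

def make_data(n):
    """Toy regression data: y = 2x plus Gaussian noise."""
    return [(x, 2 * x + rng.gauss(0, 1)) for x in range(n)]

train = make_data(20)
valid = make_data(20)

def mse(predict, data):
    """Mean squared error of a predictor over a dataset."""
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)

# Underfitting: a constant model ignores the trend entirely.
mean_y = sum(y for _, y in train) / len(train)
underfit = lambda x: mean_y

# Overfitting: memorize every training point exactly.
lookup = dict(train)
overfit = lambda x: lookup.get(x, mean_y)

print(mse(underfit, train), mse(underfit, valid))  # high on both sets
print(mse(overfit, train), mse(overfit, valid))    # zero on train, nonzero on valid
```

The tell-tale signature: an underfit model has high error everywhere, while an overfit model shows a large gap between training error and validation error.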
Feature extraction and suitable machine learning model
- When dealing with large datasets with high dimensionality (a large number of columns and variables), feature extraction is important to reduce the existing data to a smaller, manageable set of features
- It involves selecting the features the researchers deem suitable and combining them while preserving the information in the original data
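As a sketch of what feature extraction could look like here, the raw element-position JSON can be reduced to a few summary numbers that describe the layout. The record shape (`tag`, `x`, `y`, `width`, `height`) and the chosen features are assumptions, not the actual schema used.

```python
import json

# Hypothetical element-position record, as captured per page in JSON form.
elements_json = json.dumps([
    {"tag": "header", "x": 0,   "y": 0,   "width": 1920, "height": 120},
    {"tag": "nav",    "x": 0,   "y": 120, "width": 300,  "height": 800},
    {"tag": "main",   "x": 300, "y": 120, "width": 1620, "height": 800},
])

def extract_layout_features(raw_json, page_width=1920, page_height=1080):
    """Reduce raw element positions to a small, comparable feature vector."""
    elements = json.loads(raw_json)
    areas = [e["width"] * e["height"] for e in elements]
    page_area = page_width * page_height
    return {
        "element_count": len(elements),
        # Area ratios normalize away the absolute viewport size.
        "mean_area_ratio": sum(areas) / len(areas) / page_area,
        "max_area_ratio": max(areas) / page_area,
    }

features = extract_layout_features(elements_json)
print(features["element_count"])  # 3
```

Every page, regardless of how many raw columns its JSON contains, is mapped to the same short vector, which is what makes pages comparable for clustering.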