Implementing AI and ML solutions comprises two phases:
Phase 1: Building, training, testing, and hyperparameter tuning of the model
Phase 2: Deploying the model in a production environment
For Phase 1, we need to set up a cloud-based cluster of GPU/TPU-enabled servers (although a single machine can also be used). Training on image or text data with one machine takes a long time. To reduce the training time, it's advisable to use a cluster of servers for distributed training. The following factors need to be calculated for the training phase:
- Cost of the number of devices used in the training phase (billed hourly)
- Network I/O
- Cost of the number of devices used in the prediction phase
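
The trade-off between training time and hourly device cost can be sketched with a rough estimate. This is only an illustration under stated assumptions: the names `ClusterPlan`, `estimated_hours`, and `estimated_cost` are hypothetical, the hourly rate is a made-up figure rather than any provider's price, and `scaling_efficiency` is an assumed simplification that folds network I/O overhead into a single number.

```python
from dataclasses import dataclass


@dataclass
class ClusterPlan:
    """Hypothetical description of a training cluster (illustrative only)."""
    num_devices: int           # number of GPU/TPU devices in the cluster
    hourly_rate: float         # assumed price per device per hour
    scaling_efficiency: float  # 1.0 = perfect speedup; lower values model network I/O overhead


def estimated_hours(single_device_hours: float, plan: ClusterPlan) -> float:
    """Estimate wall-clock training time on the cluster.

    Assumes each extra device contributes `scaling_efficiency` of a full
    device's throughput, which is a simplification of real distributed training.
    """
    speedup = 1 + (plan.num_devices - 1) * plan.scaling_efficiency
    return single_device_hours / speedup


def estimated_cost(single_device_hours: float, plan: ClusterPlan) -> float:
    """Estimate total cost: wall-clock hours x devices x hourly rate per device."""
    return estimated_hours(single_device_hours, plan) * plan.num_devices * plan.hourly_rate


if __name__ == "__main__":
    baseline_hours = 100.0  # assumed training time on a single device
    for plan in (ClusterPlan(1, 2.0, 1.0), ClusterPlan(8, 2.0, 0.8)):
        hours = estimated_hours(baseline_hours, plan)
        cost = estimated_cost(baseline_hours, plan)
        print(f"{plan.num_devices} device(s): {hours:.1f} h, cost {cost:.2f}")
```

With these assumed numbers, the eight-device cluster finishes much sooner than the single machine but costs more in total, because imperfect scaling means some paid device-hours go to communication rather than computation. The same calculation can be repeated with prediction-phase device counts to compare serving costs.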