This guide walks you through the process of importing an existing AWS EKS cluster into the Simplismart platform. By importing your cluster, you can leverage Simplismart’s deployment, monitoring, and scaling capabilities with your existing infrastructure.
Coming Soon: Warmpool functionality and automatic node group creation are currently available only for clusters created directly on the Simplismart platform. In an upcoming release, these features will be extended to imported clusters. You’ll be able to provide limited IAM access, and Simplismart will manage warmpool and node group creation automatically, just like it does for Simplismart-created clusters.
Your cluster must have at least one node group with a minimum of 1 vCPU and the required label (see below). For auxiliary node groups, where Simplismart cluster tools are installed, use a machine size of at least m6a.xlarge.
All nodes in your cluster must have the required label. To add a label, navigate to Amazon Elastic Kubernetes Service > Clusters > <cluster_name> > <node_group> > edit and add a label in the format given below:
simplismart.ai/node-group-name: <node-group-name>
This label is essential for the Simplismart platform to identify and manage your nodes correctly.
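For scripted setups, the same label can be applied and checked from the command line. The sketch below is an illustration rather than part of the Simplismart tooling: the function names are hypothetical, and it assumes a managed node group plus `aws` and `kubectl` that are already configured for the cluster. It sets the label with `aws eks update-nodegroup-config` and uses the `!key` label selector to list nodes that lack it.

```shell
LABEL_KEY="simplismart.ai/node-group-name"

# label_node_group CLUSTER NODEGROUP   (hypothetical helper)
# Sets the label on a managed node group, using the node group name as the value.
label_node_group() {
  aws eks update-nodegroup-config \
    --cluster-name "$1" \
    --nodegroup-name "$2" \
    --labels "{\"addOrUpdateLabels\":{\"${LABEL_KEY}\":\"$2\"}}"
}

# unlabeled_nodes   (hypothetical helper)
# Prints nodes without the label; the selector '!key' matches objects lacking the key.
unlabeled_nodes() {
  kubectl get nodes -l "!${LABEL_KEY}" -o name
}

# check_labels   (hypothetical helper)
# Returns 0 when every node carries the label, 1 otherwise.
check_labels() {
  local missing
  missing=$(unlabeled_nodes)
  if [ -n "$missing" ]; then
    echo "Nodes missing ${LABEL_KEY}:"
    echo "$missing"
    return 1
  fi
  echo "All nodes carry ${LABEL_KEY}"
}
```

Running `check_labels` before starting the import gives a quick signal on whether any node still needs the label.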
If you are importing a cluster, you need to create a node group yourself on AWS. Only then can you register it on the Simplismart platform. See the Create AWS EKS Cluster guide for detailed instructions on creating node groups with the required label.
Configure your Kubernetes credentials as a secret in Simplismart to authenticate with your AWS EKS Cluster. AWS supports both token-based and certificate-based authentication. This guide covers token-based authentication using three different approaches.
Important Timing Constraint: AWS EKS tokens have a 15-minute validity period. Since cluster import on Simplismart takes approximately 10 minutes, you must start the import process within 5 minutes of generating the kubeconfig credentials. If the token expires during import, the cluster import will fail with an authentication error. If this happens, regenerate the credentials and retry the import.
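The arithmetic behind this window can be sketched in a few lines of shell. This is only an illustration using the two durations quoted above; the function names are hypothetical, and the 10-minute import time is approximate, so treat the result as a rough guide.

```shell
TOKEN_TTL_SECONDS=$((15 * 60))   # token validity quoted in this guide
IMPORT_SECONDS=$((10 * 60))      # approximate import duration quoted in this guide

# seconds_left_to_start GENERATED_AT NOW   (both as Unix epoch seconds)
# Prints how many seconds remain in which the import can still be started safely.
seconds_left_to_start() {
  echo $(( $1 + TOKEN_TTL_SECONDS - IMPORT_SECONDS - $2 ))
}

# can_start_import GENERATED_AT NOW
# Succeeds only while there is time left to start the import.
can_start_import() {
  [ "$(seconds_left_to_start "$1" "$2")" -gt 0 ]
}
```

One way to use it: record `generated_at=$(date +%s)` right after generating the kubeconfig, then run `can_start_import "$generated_at" "$(date +%s)"` just before starting the import, and regenerate the credentials if it fails.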
Method 1: Using AWS CLI
This method uses the AWS CLI to generate kubeconfig credentials programmatically.
1
Configure AWS CLI
Ensure you have AWS CLI installed and configured with appropriate credentials:
aws configure
Enter your AWS Access Key ID, Secret Access Key, default region, and output format when prompted.
2
Update Kubeconfig
Update your local kubeconfig file to include the EKS cluster credentials:
aws eks update-kubeconfig --region <region> --name <cluster-name>
Replace <region> with your AWS region (e.g., us-east-1) and <cluster-name> with your EKS cluster name. This command adds the cluster context to your ~/.kube/config file.
3
Verify Connection
Test the connection to your cluster:
kubectl get nodes
If successful, you should see a list of nodes in your cluster.
4
Extract Kubeconfig
Extract the kubeconfig content to add as a secret in Simplismart:
cat ~/.kube/config
Copy the entire output. You’ll use this in the next step.
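If your ~/.kube/config already holds other clusters, copying the whole file also copies their entries. One possible variation on steps 2 to 4 is sketched below: it writes the EKS credentials to a separate file with the `--kubeconfig` option of `aws eks update-kubeconfig`, verifies them, and prints only that file. The function name and the output path are hypothetical, and the timing constraint described earlier still applies.

```shell
# export_kubeconfig REGION CLUSTER OUTFILE   (hypothetical helper)
# Writes credentials for one cluster to OUTFILE, verifies them, and prints the file.
export_kubeconfig() {
  local region=$1 cluster=$2 outfile=$3

  # --kubeconfig keeps these credentials out of ~/.kube/config.
  aws eks update-kubeconfig --region "$region" --name "$cluster" \
    --kubeconfig "$outfile" || return 1

  # Fail early if the cluster is not reachable with the new credentials.
  kubectl --kubeconfig "$outfile" get nodes >/dev/null || return 1

  # Print the content to paste into the Simplismart secret.
  cat "$outfile"
}
```

For example, `export_kubeconfig us-east-1 <cluster-name> ./simplismart-kubeconfig` prints a kubeconfig containing only that cluster, provided OUTFILE did not exist beforehand.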
Add a new secret with the kubeconfig content you copied
Follow the detailed configuration instructions in the Secrets documentation
For production environments, it’s recommended to create a dedicated IAM role with minimal required permissions for Simplismart access, rather than using admin credentials.
Metrics Server
Collects basic pod and node resource usage (CPU/memory) to enable Kubernetes Horizontal Pod Autoscaling and efficient resource management.
Cluster Autoscaler
Automatically adds or removes nodes in your cluster based on workload demand, optimizing costs by scaling infrastructure dynamically.
Prometheus Adapter
Fallback component that allows switching to the Kubernetes Horizontal Pod Autoscaler (HPA) and scaling based on the number of requests or other custom metrics.
KEDA
Our main event-driven autoscaler, which scales applications based on workload activity (e.g., queue length, message count), improving performance and reducing costs during idle periods. It is required for scaling to zero.
Long-term metrics storage system that provides reliable, scalable storage for time-series data, enabling historical analysis and trend monitoring.
Loki
Centralized log aggregation system that collects and stores logs from all applications, making it easy to search, filter, and debug issues across your cluster.
Promtail
Log shipping agent that collects logs from your applications and forwards them to Loki, making all logs searchable in the Simplismart platform.
Grafana Prometheus Stack
Complete monitoring solution providing dashboards, metrics collection, and alerting capabilities to monitor workload health and system performance.
DCGM Exporter
GPU monitoring tool that tracks NVIDIA GPU utilization, temperature, and health metrics, essential for optimizing AI/ML workloads.
Simplismart Agent
Simplismart’s internal monitoring agent that collects operational metrics and system health data, such as disk pressure, node readiness, and degraded pods, for platform integration.
3
Register Node Groups
Configure your node groups to be managed by the Simplismart platform. Node group configuration allows you to manage your cluster resources based on workload types, hardware requirements, and scaling policies. Using this configuration, Simplismart can manage the nodes intelligently and distribute resources effectively.
Provide the node group label you configured earlier in the Prerequisites section. This label allows Simplismart to identify and manage your node group.
You can register multiple node groups with different configurations to use your cluster resources effectively on the Simplismart platform.
Enable Use Node Group Configuration to have Simplismart automatically allocate resources during deployment. If it is not enabled, you need to provide the node configuration every time you deploy.
| Field | Description |
| --- | --- |
| Accelerator Type | Select either CPU or GPU based on your workload requirements |
| Accelerator Count | Number of accelerators (GPUs) per node |
| Min Node Count | Minimum number of nodes to maintain in this node group |
| Max Node Count | Maximum number of nodes allowed in this node group for autoscaling |
| CPU | Number of CPU cores per node |
| Memory | Memory allocation in GB per node |
When this option is enabled, Simplismart will manage the specified node group in your EKS cluster based on the resource configuration provided. The node group must already exist in your cluster.
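Since the node group must already exist, its current settings can be looked up in AWS before filling in the form. The sketch below is a hedged illustration: the function names are hypothetical, it assumes a managed node group and a configured AWS CLI, and the range check only encodes the assumption that the minimum node count should not exceed the maximum.

```shell
# nodegroup_summary CLUSTER NODEGROUP   (hypothetical helper)
# Prints the labels, instance types, and scaling limits of an existing node group.
nodegroup_summary() {
  aws eks describe-nodegroup \
    --cluster-name "$1" \
    --nodegroup-name "$2" \
    --query 'nodegroup.{labels: labels, instanceTypes: instanceTypes, minSize: scalingConfig.minSize, maxSize: scalingConfig.maxSize}' \
    --output json
}

# valid_node_range MIN MAX   (hypothetical helper)
# Succeeds when 0 <= MIN <= MAX, a basic sanity check before registering.
valid_node_range() {
  [ "$1" -ge 0 ] && [ "$1" -le "$2" ]
}
```

The instance type in the output can then be mapped to the CPU, memory, and accelerator values for the fields above.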
Enable Mark as Auxiliary if this node group should be reserved for supporting workloads rather than primary AI/ML operations.
Common use cases for auxiliary node groups:
Monitoring and logging services (refer to Step 2 for details)
Internal tooling and platform services
4
Add Tags (Optional)
Add custom tags to identify and manage your cluster for billing, cost allocation, and resource management. Tags are auto-populated in the billing section and in the event center for easier tracking and visualization. You can also create environment tags such as Testing, Staging, or Production; environment-specific tags show how much each environment contributes to your bill.
Click Add Tag and provide key-value pairs as needed.
Imported clusters currently support container-based deployments. You can deploy Docker/Depot containers with full integration into the Simplismart platform, including:
Monitoring via the observability stack (Grafana, Prometheus, Loki)
Auto-scaling via the scalability stack (Metrics Server, Cluster Autoscaler, KEDA)