  
The namespace defines which nodes we want to address and which roles are used.
In our case, the only additional permission we need is that a pod can see other pods (so our terminal pod can find the GroLink pods). We can simply use the built-in role named ''system:node''.
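To see what this built-in role allows, we can inspect it directly on the cluster (an optional check):
<code bash>
kubectl describe clusterrole system:node
</code>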
  
Therefore we need a ClusterRoleBinding that defines who has this role; in our case we just bind it to the default service account because this is the simplest. The following should be stored in rolebinding.yaml:
<code yaml>
apiVersion: rbac.authorization.k8s.io/v1
</code yaml ->apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: grolink-node-binding # assumed name; any valid name works
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:node
subjects:
  - kind: ServiceAccount
    name: default
    namespace: grolinktutorial
</code>

We apply this binding with kubectl:
<code bash>
kubectl apply -f rolebinding.yaml
</code>
  
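The GroLink pods themselves come from a Deployment stored in grolinkDeploy.yaml. A minimal sketch of what such a file could look like, assuming an image that serves the GroLink API on port 58081 and the label app: grolink that the selectors below rely on (image name and replica count are placeholders):
<code yaml>
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grolink
  namespace: grolinktutorial
spec:
  replicas: 3                  # placeholder; scale to your cluster
  selector:
    matchLabels:
      app: grolink
  template:
    metadata:
      labels:
        app: grolink           # the label the GroPy scripts below select on
    spec:
      containers:
      - name: grolink
        image: <grolink-image> # placeholder for the GroLink server image
        ports:
        - containerPort: 58081 # the GroLink API port used below
</code>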
This deployment can be applied just like the role binding:
<code bash>
kubectl apply -f grolinkDeploy.yaml
</code>
  
  
For our terminal pod we are not very picky: we just need a pod that runs forever and can execute Python code. Therefore we can just use a Python image with some dependencies installed and let it run the embedded web server (we don't need this server at all, but if the pod is not kept busy it dies).
<code yaml>
apiVersion: apps/v1
kind: Deployment
metadata:
  name: terminal
  namespace: grolinktutorial
spec:
  replicas: 1
  selector:
    matchLabels:
      app: terminal # assumed label
  template:
    metadata:
      labels:
        app: terminal
    spec:
      containers:
      - name: terminal
        image: registry.gitlab.com/grogra/groimp-models/sensi1/gropysensbase:latest
        args: ["python", "-m", "http.server"]
</code>
We also need to deploy this:
<code bash>
kubectl apply -f terminalDeploy.yaml
</code>
  
Now we can use kubectl to copy the Python file run.py with the code from above onto this pod:
<code bash>
kubectl -n grolinktutorial cp run.py terminal-<second-part-of-the-name>:/app
</code>
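The placeholder <second-part-of-the-name> stands for the random suffix Kubernetes appends to the pod name; the full name can be looked up with:
<code bash>
kubectl -n grolinktutorial get pods
</code>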
  
Then we can execute the file on the pod:
<code bash>
kubectl -n grolinktutorial exec terminal-<second-part-of-the-name> -- python run.py
</code>
  
===== Run a simulation =====

Now, with the existing connection, we can run our model for the first time. To do so, we first need to copy our model to the terminal pod:

<code bash>
kubectl -n grolinktutorial cp model.gsz terminal-XXX:/app
</code>

Then we can open this project using the GroPy library. It is important to open it with the content of the gsz file and not with a link to it, because the API server runs on another system.

Then we can update and execute our file as we know it from other API examples:

<code python>
from GroPy import GroPy
import kr8s

# collect the IPs of all GroLink pods
podIPs = []
selector = {'app': 'grolink'}
for podS in kr8s.get("pods", namespace="grolinktutorial", label_selector=selector):
    podIPs.append(podS.status.podIP)
# create a link to the first pod
link = GroPy.GroLink("http://" + podIPs[0] + ":58081/api/")
# open the workbench with a POST request
wb = link.openWB(content=open("model.gsz", 'rb').read()).run().read()
# change the parameters of the simulation
wb.updateFile("param/parameters.rgg", bytes("""
            static float lenV=4;
            static float angle=12;
            """, 'utf-8')).run()
wb.compile().run()
# execute the run function
data = wb.runRGGFunction("run").run().read()
print(data)
# close the workbench
wb.close().run()
</code>

If we then update our run.py file on the terminal pod and execute it, we can get a result like ''{'console': ['0.2'], 'logs': []}''.

===== Running on all pods =====

To use the full potential of the cluster, we need to send requests to all the pods in parallel. This can be done from our single terminal pod using the Python multiprocessing library.

Using this library we can initialize a pool of "workers" that are each linked to one API server; depending on the number of workers, several of them can share one server, since the API server is multi-threaded.

With this pool of workers we can then work through a list of parameter sets, push each set into a simulation, and collect the results in a file.

==== Generate input data with SALib ====

To generate the input data we can use the saltelli.sample function from [[https://salib.readthedocs.io/en/latest|SALib]]. This creates a distribution of input datasets within the defined ranges:

<code python>
from SALib.sample import saltelli

# define the parameters and their value ranges
problem = {
    'num_vars': 2,
    'names': ['lenV', 'angle'],
    'bounds': [[0.1, 1], [30, 70]]
}
param_values = saltelli.sample(problem, 2**4) # create the parameter sets
</code>

''%%2**4%%'' is the base sample size N; with the default second-order sampling, saltelli.sample generates N*(2D+2) parameter sets (96 here, for D=2 variables). We keep N very low for now because we work on a simulated cluster and don't want to cause trouble.
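A quick sanity check of the generated array (the expected shape assumes the problem definition above):
<code python>
print(param_values.shape) # expected: (96, 2) for N=16 and D=2 variables
</code>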

==== Linking processes/"workers" to API servers ====

First we create a list with links to all the API servers:
<code python>
links = []
selector = {'app': 'grolink'}
for podS in kr8s.get("pods", namespace="grolinktutorial", label_selector=selector):
    links.append(GroPy.GroLink("http://" + podS.status.podIP + ":58081/api/"))
</code>

Then we can use this list to create a queue long enough to "feed" all "workers" with links; with nine workers and three GroLink pods, for example, each API server would serve three workers.
<code python>
WORKERCOUNT = 9
pods = multiprocessing.Queue()
n = len(links)
# distribute the links round-robin over the workers
for i in range(0, WORKERCOUNT):
    pods.put(links[i % n])
</code>

This queue is required so that the workers can be initialized in parallel using the following function:

<code python>
# initialize each worker: take one link from the queue and open a workbench on it
def init_worker(function, queue):
    function.cursor = queue.get().openWB(content=open("model.gsz", 'rb').read()).run().read()
</code>

The attribute function.cursor is then defined for each worker separately by taking one link from the queue, emptying it in the process.
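This pattern works because Python functions are objects that can carry arbitrary attributes, and each worker process gets its own copy of the function. A minimal standalone illustration (the names are made up for this example):
<code python>
def f():
    # reads the attribute attached to the function object
    print(f.cursor)

f.cursor = "per-process state"
f() # prints: per-process state
</code>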

==== The actual function ====

The actual growth function is not much different from the one we used above to test our model for the first time, except that we use the variable grow.cursor as the workbench, because we already know that it will be initialized that way. And since we only get one tuple as input from the SALib parameter set, we split it in the first line:

 +<code python>
 +# the actual execution 
 +def grow(val):
 +    lenV, angle = val
 +    results = []
 +    #overwrite the parameters in the file
 +    grow.cursor.updateFile("param/parameters.rgg",bytes("""
 +            static float lenV="""+str(lenV)+""";
 +            static float angle="""+str(angle)+""";
 +            """,'utf-8')).run()
 +    grow.cursor.compile().run()
 +    for x in range(0,10): #execute 10 times
 +        data=grow.cursor.runRGGFunction("run").run().read()
 +        results.append(float(data['console'][0]))
 +    return results
 +</code>

==== Running and saving ====

In the final step we initialize a multiprocessing pool using the init_worker function, with the grow function and the pods queue as arguments, and map this pool over the generated input values.

Finally we can transform our results and save them in a CSV file:
<code python>
# multiprocessing: every worker opens its workbench in init_worker
pool = multiprocessing.Pool(processes=WORKERCOUNT, initializer=init_worker, initargs=(grow, pods,))
results = pool.map(grow, param_values)
pool.close()
y = np.array(results)

# save the result
np.savetxt("result.csv", y, delimiter=",")
</code>

==== Running it ====

After we put all this together and run it as we did above, we can read our CSV file through our terminal pod:
<code bash>
kubectl -n grolinktutorial exec terminal-XXXX -- cat result.csv
</code>
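Alternatively, the file can be copied from the pod to the local machine:
<code bash>
kubectl -n grolinktutorial cp terminal-XXXX:/app/result.csv result.csv
</code>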

For simplicity, you can find all of the above Python code in one file here:
 +<code python>
 +import numpy as np
 +from SALib.sample import saltelli
 +from GroPy import GroPy
 +import multiprocessing
 +import kr8s
 +from kr8s.objects import Pod
 +
 +WORKERCOUNT =9
 +
 +# defining the problem
 +problem = {
 +    'num_vars': 2,
 +    'names': ['lenV', 'angle'],
 +    'bounds': [[0.1, 1],[30, 70]]
 +}
 +param_values = saltelli.sample(problem, 2**2) # create parameter set
 +
 +#creating a link for each pod
 +links=[]
 +selector = {'app': 'grolink'}
 +for podS in kr8s.get("pods", namespace="grolinktutorial", label_selector=selector):
 +    print("x"+podS.status.podIP)
 +    links.append(GroPy.GroLink("http://"+podS.status.podIP+":58081/api/"))
 +
 +# create an queue to assign pods to workers 
 +pods = multiprocessing.Queue()
 +n = len(links)
 +for i in range(0,WORKERCOUNT):
 +    pods.put(links[i%n])
 +
 +#initialize each worker
 +def init_worker(function,pods ):
 +    function.cursor = pods.get().openWB(content=open("model.gsz",'rb').read()).run().read()
 +
 +# the actual execution 
 +def grow(val):
 +    lenV, angle = val
 +    results = []
 +    #overwrite the parameters in the file
 +    grow.cursor.updateFile("param/parameters.rgg",bytes("""
 +            static float lenV="""+str(lenV)+""";
 +            static float angle="""+str(angle)+""";
 +            """,'utf-8')).run()
 +    grow.cursor.compile().run()
 +    for x in range(0,10): #execute 10 times
 +        data=grow.cursor.runRGGFunction("run").run().read()
 +        results.append(float(data['console'][0]))
 +    return results
 +    
 +# Multi processing
 +pool = multiprocessing.Pool(processes=WORKERCOUNT,initializer=init_worker, initargs=(grow,pods,))
 +results = pool.map(grow,param_values)
 +pool.close()
 +y = np.array(results)
 +
 +# save result
 +np.savetxt("result.csv", y, delimiter=",")
 +
 +</code>