13 changes: 11 additions & 2 deletions docs/examples/getting_started/Dockerfile
Several members of my team work almost exclusively in R, so it was important to establish the ability to run simple R scripts rather than only simple Python scripts. To test this, I created a simple Dockerfile and a corresponding hello-world script that you could consider including in this PR, or I can include them in my own PR.

Original file line number Diff line number Diff line change
@@ -1,2 +1,11 @@
FROM python:3.11-slim
RUN pip install pandas scikit-learn duckdb
# Use the official Python 3.11 image
FROM python:3.11

# Set the working directory inside the container
WORKDIR /app

# Install Python dependencies
RUN pip install --no-cache-dir pandas duckdb

# Define the command to run your application
CMD ["python3", "-c", "print('Hello!')"]
47 changes: 38 additions & 9 deletions docs/examples/getting_started/cloudclient_walkthrough.ipynb
@@ -38,7 +38,7 @@
"id": "b60f671d",
"metadata": {},
"source": [
"The initialization below is the simplest way to create and instance of the `CloudClient` class. If a variable called AZURE_KEYVAULT_NAME is saved to your environment, the `CloudClient` will initialize based on some Azure values stored in the Key Vault. Otherwise it will use environment variables or values stored in a .env file to authenticate, like the .env file stored [here](../../files/sample.env), and a managed identity credential based on your local working environment. The .env file should be stored at the same level in the directory in which you're working."
"The initialization below is the simplest way to create and instance of the `CloudClient` class. If a variable called AZURE_KEYVAULT_NAME is saved to your environment, the `CloudClient` will initialize based on some Azure values stored in the Key Vault. Otherwise it will use environment variables or values stored in a .env file to authenticate, like the .env file stored [here](../../files/sample.env), and a managed identity credential based on your local working environment. The .env file should be stored at the same level in the directory in which you're working. **Make sure to update your .env file based on the sample with values relevant to your Azure environment.**"
I do not think I understand what this direction is telling the user to do. Is it saying you can have a .env file with a single line AZURE_KEYVAULT_NAME = "CFA-Predict"? Or is it saying you should manually set a variable AZURE_KEYVAULT_NAME = "CFA-Predict" prior to running CloudClient()? Neither one worked for me; in both cases I get AttributeError: A non-None value for attribute azure_batch_account is required to obtain a value for Azure batch endpoint URL.


It did work to use cc = CloudClient(keyvault = 'CFA-Predict'), but if that is the easiest way to do things, why does it appear commented out and second in the walkthrough? This line implies there is a way to do the same thing by having "a variable called AZURE_KEYVAULT_NAME saved to your environment", but neither way I could think to do that worked.

]
},
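For reference, one way to set the variable asked about in the thread above is to write it into the process environment before constructing the client. This is a sketch: whether `CloudClient()` then picks the variable up as the walkthrough describes is exactly what the thread is trying to confirm, so the client calls are left as comments.

```python
import os

# Set the Key Vault name in this process's environment before constructing
# the client; equivalent to `export AZURE_KEYVAULT_NAME="CFA-Predict"` in the shell.
os.environ["AZURE_KEYVAULT_NAME"] = "CFA-Predict"

# With the variable present, the no-argument form described in the walkthrough
# is meant to use it:
# cc = CloudClient()
# The explicit form reported to work in this thread:
# cc = CloudClient(keyvault="CFA-Predict")

print(os.environ.get("AZURE_KEYVAULT_NAME"))  # CFA-Predict
```

Note that a variable set this way only exists for the current process; setting it in a shell profile or the .env file makes it persistent.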
{
@@ -61,7 +61,7 @@
"id": "a2643149",
"metadata": {},
"source": [
"We could also specify the Key Vault directly."
"We could also specify the Key Vault directly. If a Key Vault is specified, a .env file is no longer needed. This is the easiest way to authenticate using CFA's Key Vault."
]
},
{
@@ -139,7 +139,24 @@
"\n",
"There are plenty of times when local files would need to be uploaded to Blob Storage. Files can be referenced from within a running job via a mount in the pool. Scripts in Blob Storage can also be referenced in the command line for the task execution.\n",
"\n",
"For example, we have the `main.py` file that we want to upload to the Blob container 'input-test' in order to use it for a future task. The following code will upload to the root of the specified container."
"For example, we have the `main.py` file that we want to upload to the Blob container 'input-test' in order to use it for a future task. The following code will upload to the root of the specified container. *Note that the container must already exist in Blob Storage.*\n",
"\n",
"For experimentation, you should create a new testing container (like \"input-test-<username>\" for example) and be sure not to overwrite anything important. "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fe7a4c5b",
"metadata": {
"vscode": {
"languageId": "plaintext"
}
},
"outputs": [],
"source": [
"# uncomment below and input your username to create a new blob container\n",
"#cc.create_blob_container(\"input-test-<username>\")"
]
},
{
@@ -153,6 +170,7 @@
},
"outputs": [],
"source": [
"# upload main.py to container \"input-test\"\n",
"cc.upload_files(\n",
" \"main.py\",\n",
" container_name = \"input-test\"\n",
@@ -164,9 +182,11 @@
"id": "1f65c7b6",
"metadata": {},
"source": [
"## Upload Image to Container Registry\n",
"## Upload Image to Azure Container Registry\n",
"\n",
"Batch pools can use images from Azure Container Registry, GitHub Container Registry, or Docker Hub. Suppose we want to package a local Dockerfile (python image with a few requirements) and upload to the Azure Container Registry for use by the pool. This Dockerfile should exist at the root of your working directory, or you can specify the path to the Dockerfile. The following code would do the trick if your Dockerfile exists at the root of your working directory. Make sure to reference the correct registry name.\n",
"\n",
"Batch pools can use images from Azure Container Registry, GitHub Container Registry, or Docker Hub. Suppose we want to package the local Dockerfile (python image with a few requirements) and upload to the Azure Container Registry for use by the pool. The following code would do the trick. Make sure to reference the correct registry name."
"Your Dockerfile can be the same Dockerfile for running your code in a container locally. See the [Docker Docs](https://docs.docker.com/) for help getting started with Docker. You can also find an example python Dockerfile [here](./Dockerfile)."
]
},
{
@@ -183,7 +203,8 @@
"container_name = cc.package_and_upload_dockerfile(\n",
" registry_name = \"my_azure_registry\",\n",

The only value for registry_name that works for me is "cfprdbatchcr". Using the value in this example for the walkthrough results in the following error:

ERROR: Registry names may contain only alpha numeric characters and must be between 5 and 50 characters
The push refers to repository [my_azure_registry.azurecr.io/cloudops-demo]
Get "https:/v2/": http: no Host in request URL

Adjusting based on this error to use - instead of _ results in the following error:

WARNING: The resource with name 'my-azure-registry' and type 'Microsoft.ContainerRegistry/registries' could not be found in subscription 'EXT-EDAV-CFA-PRD (ef340bd6-2809-4635-b18b-7e6583a8803b)'.Using 'my-azure-registry.azurecr.io' as the default registry login server.

ERROR: Could not connect to the registry login server 'my-azure-registry.azurecr.io'. Please verify that the registry exists and the URL 'https://my-azure-registry.azurecr.io/v2/' is reachable from your environment.Try running 'az acr check-health -n my-azure-registry --yes' to diagnose this issue.
The push refers to repository [my-azure-registry.azurecr.io/cloudops-demo]
Get "https://my-azure-registry.azurecr.io/v2/": dial tcp: lookup my-azure-registry.azurecr.io on 127.0.0.53:53: no such host

Provide users the correct value that will work for them, or, if it is not the same between users, provide a way to determine which value works for them. If a value is meant to be a stand-in that prompts users to find their appropriate value rather than being taken literally, this needs to be made clear in the text above or in the code comments, with a detailed description of how to find the appropriate value.

" repo_name = \"simple_test\",\n",
" tag = \"latest\"\n",
" tag = \"latest\",\n",
" path_to_dockerfile = \"./Dockerfile\" #this line only needed if Dockerfile not at root of working directory\n",
")"
]
},
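The registry errors reported in the thread above state two constraints: names may contain only alphanumeric characters and must be 5 to 50 characters long. A minimal client-side check, sketched from that error message alone rather than the full ACR naming rules, can flag placeholder values like my_azure_registry before a push is attempted:

```python
import re

def looks_like_acr_name(name: str) -> bool:
    """Rough check based on the error in the thread: registry names may
    contain only alphanumeric characters and must be 5-50 characters long.
    Not a substitute for `az acr check-health -n <name>`."""
    return re.fullmatch(r"[A-Za-z0-9]{5,50}", name) is not None

print(looks_like_acr_name("my_azure_registry"))  # False (underscore is invalid)
print(looks_like_acr_name("cfprdbatchcr"))       # True (value reported to work at CFA)
```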
@@ -194,7 +215,13 @@
"source": [
"## Create a Pool\n",
"\n",
"Pools are usually created for each team or per project. It spins up nodes when necessary based on the container you specify. The following would create a pool based on the Docker image we just uploaded, autoscaling to 5 nodes, mounting to the 'input-test' container we uploaded to, an 8 core CPU, and call it 'getting-started-pool'. "
"Pools are usually created for each team or per project. It spins up nodes when necessary based on the container you specify. \n",
"\n",
"It's at this point we specify which Blob Containers we mount to the pool. This will make blobs in Blob Storage accessible to read or write for the containers that we mount. The mounts are then accessible in your code at the root of the node, i.e. a mounted container called 'input-test' will be accessible in your code via `/input-test`. \n",
Access to mounted files continues to be an issue for my team. It is unclear as of yet what is going on, but in some cases tasks fail to access the files in the blob container mounted to the pool.

"\n",
"The following would create a pool based on the Docker image we just uploaded, autoscaling to 5 nodes, mounting to the 'input-test' container to which we uploaded, use an 8 core CPU, and call it 'getting-started-pool'. \n",
"\n",
"You could also specify vm_size from a list of xsmall, small, medium, large, and xlarge. These will use 2, 4, 8, 16, or 32 cores, respectively."
ryanraaschCDC marked this conversation as resolved.

Can you add more information about what the various arguments for create_pool do? In the walkthrough you have max_autoscale_nodes=5, but when I tried this out my job did not seem to autoscale like it used to with our old cfa-azure pipeline. Does autoscale need to be set to true? What's going on?

]
},
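The vm_size labels described above map to fixed core counts. A small lookup sketch, with the labels and counts taken from the walkthrough text (the underlying Azure VM SKUs behind each label are not specified here):

```python
# Core counts per vm_size label, per the walkthrough text above.
VM_SIZE_CORES = {"xsmall": 2, "small": 4, "medium": 8, "large": 16, "xlarge": 32}

def cores_for(vm_size: str) -> int:
    """Look up the core count for a vm_size label, failing loudly on typos."""
    try:
        return VM_SIZE_CORES[vm_size]
    except KeyError:
        raise ValueError(
            f"unknown vm_size {vm_size!r}; choose from {sorted(VM_SIZE_CORES)}"
        )

print(cores_for("medium"))  # 8 -- the 8-core CPU used in the example pool
```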
{
@@ -224,7 +251,7 @@
"source": [
"## Create a Job\n",
"\n",
"Now we can create a job to run our set of tasks. Let's call it 'getting-started-job'."
"Now we can create a job to run our set of tasks. Let's call it 'getting-started-job'. Jobs are meant to capture all the tasks for one goal. For example, if we wanted to run a model for each state then compile the outputs, this whole process would make up one job. Each model and compilation would be an individual task in the job."
]
},
{
@@ -252,7 +279,9 @@
"source": [
"## Add Tasks to Job\n",
"\n",
"At this point we are ready to add tasks to the job we created. We can run the `main.py` python script that we uploaded to the 'input-test' container. It takes an argument called '--user' and prints a welcome message to the console. We will add two tasks to our job for two different users. In general, any number of tasks can be added to a job."
"At this point we are ready to add tasks to the job we created. We can run the `main.py` python script that we uploaded to the 'input-test' container. It takes an argument called '--user' and prints a welcome message to the console. We will add two tasks to our job for two different users. In general, any number of tasks can be added to a job.\n",
"\n",
"Notice that we reference the mounted Blob Container with /input-test (the leading / is important)."

It seems there may be additional issues related to the pathing in this line besides the importance of the leading /:

Fatal error: cannot open file '/input-test-edp/cloudops_helloworld.r': No such file or directory

]
},
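The leading-slash point above can be captured in a small helper: a container mounted at pool creation shows up at the node root, so script paths must begin with /<container-name>. The helper name, the command shape, and the example user are illustrative only, not part of the CloudClient API:

```python
def mounted_script_command(container: str, script: str, user: str) -> str:
    """Build a task command that runs a script from a mounted Blob container.
    The mount appears at the node root, so the path starts with '/<container>'.
    (Illustrative helper; not a CloudClient method.)"""
    return f"python3 /{container.strip('/')}/{script} --user {user}"

print(mounted_script_command("input-test", "main.py", "ryan"))
# python3 /input-test/main.py --user ryan
```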
{
1 change: 0 additions & 1 deletion docs/files/sp_sample.env
@@ -2,7 +2,6 @@
AZURE_TENANT_ID="your azure tenant id"
AZURE_SUBSCRIPTION_ID="your subscription id"
AZURE_CLIENT_ID="your azure service principal client id"
AZURE_SP_CLIENT_ID="your azure service principal client id"
AZURE_CLIENT_SECRET="your client secret" #pragma: allowlist secret

# Azure account info
2 changes: 2 additions & 0 deletions docs/overview.md
@@ -31,3 +31,5 @@ There are several components of this repo that provide benefits to developers in
- CloudClient object for easy interaction with the cloud
- more info found [here](./CloudClient/index.md)
- automation component to run jobs/tasks from a configuration file
- ContainerAppClient object for easy interaction with Container App Jobs
ryanraaschCDC marked this conversation as resolved.
- more info found [here](./ContainerAppClient/index.md)
11 changes: 11 additions & 0 deletions docs/troubleshooting.md
@@ -7,3 +7,14 @@
The default authentication method for the `CloudClient` is a Managed Identity. If the Managed Identity on your VM is not set up at all, or is not set up correctly, you will experience issues authenticating.

Solution: confirm your VM has the right Managed Identity set up for the Azure environment. If working at CFA, please reach out to the CFA Tools Team. An easy way to check your Managed Identity is to run `az login --identity` in your terminal.


### Error Instantiating CloudClient

If you experience errors when creating an instance of `CloudClient()` using a .env file, it's possible the issue is coming from the .env file itself. Make sure the keys in your .env file match exactly with the keys in the sample .env. If all keys are present, it's likely an issue with a value in the .env. Confirm all values are correct.
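One stdlib-only way to catch the key mismatch described above, before instantiating the client, is to diff your .env keys against the sample's. This is a sketch that assumes the simple KEY="value" format used by the sample files:

```python
def env_keys(path: str) -> set:
    """Collect the KEY names from a KEY="value" style .env file,
    skipping blank lines and comment lines."""
    keys = set()
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            keys.add(line.split("=", 1)[0].strip())
    return keys

# Compare against the sample shipped in docs/files, e.g.:
# missing = env_keys("sample.env") - env_keys(".env")
# if missing:
#     print("keys missing from your .env:", sorted(missing))
```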

### File Not Found During Job

If you are interacting with files during a job and getting errors like a file is not found, it can be originating from two places:
1. incorrect mount reference. The blob container should be mounted during pool creation and referenced at the root of the Docker container. For example, a container called `my-container` would be referenced as `/my-container` in code, unless you provided a relative mount path when creating the pool.
2. file not present in container. If you are referencing a file that should exist in your Docker container, confirm the path where it exists. Note that Docker sets a working directory so any relative paths will start from the working directory specified in your Docker container.
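For case 1 above, a quick diagnostic to run inside the task itself is to list what actually exists at the expected mount point. This is a stdlib sketch; the root parameter exists only so the helper can be exercised outside a Batch node:

```python
import os

def check_mount(container: str, filename: str, root: str = "/") -> str:
    """Report whether a mounted Blob container, and a file inside it, are
    visible from the current task. Mounts appear at the node root by default."""
    mount = os.path.join(root, container.strip("/"))
    if not os.path.isdir(mount):
        return f"mount {mount} not found; {root} contains: {sorted(os.listdir(root))}"
    if filename not in os.listdir(mount):
        return f"{mount} exists but lacks {filename}; it contains: {sorted(os.listdir(mount))}"
    return f"{mount}/{filename} is present"

# Inside a task, e.g.: print(check_mount("input-test", "main.py"))
```

The first branch distinguishes a missing mount (pool configuration problem) from a missing file (upload or pathing problem), which is exactly the split between cases 1 and 2 above.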

Two of my teammates are troubleshooting a similar issue and have looked into both of these suggestions, as well as a number of other debugging avenues, and gotten nowhere:

This is the stdout after including a pwd and ls in the add_task line in addition to running the helloworld script:

/
bin
boot
dev
etc
home
lib
lib32
lib64
libx32
media
mnt
opt
proc
rocker_scripts
root
run
sbin
srv
sys
tmp
usr
var
Fatal error: cannot open file '/input-test-edp/cloudops_helloworld.r': No such file or directory
