Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable model custom dependency installation using virtual environment #2910

Merged
merged 18 commits into from
Feb 9, 2024
Merged
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,8 @@
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
Expand Down Expand Up @@ -77,12 +79,25 @@ public static String getPythonRunTime(Model model) {
Manifest.RuntimeType runtime = model.getModelArchive().getManifest().getRuntime();
if (runtime == Manifest.RuntimeType.PYTHON) {
pythonRuntime = configManager.getPythonExecutable();
Path pythonVenvRuntime = Paths.get(getPythonVenvPath(model), "bin", "python");
if (Files.exists(pythonVenvRuntime)) {
pythonRuntime = pythonVenvRuntime.toString();
}
} else {
pythonRuntime = runtime.getValue();
}
return pythonRuntime;
}

public static String getPythonVenvPath(Model model) {
File modelDir = model.getModelDir();
if (Files.isSymbolicLink(modelDir.toPath())) {
modelDir = modelDir.getParentFile();
}
Path venvPath = Paths.get(modelDir.getAbsolutePath(), "venv");
return venvPath.toString();
}

public static String[] getCppEnvString(String libPath) {
ArrayList<String> envList = new ArrayList<>();
StringBuilder cppPath = new StringBuilder();
Expand Down
139 changes: 95 additions & 44 deletions frontend/server/src/main/java/org/pytorch/serve/wlm/ModelManager.java
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,6 @@
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
Expand Down Expand Up @@ -205,87 +204,139 @@ private ModelArchive createModelArchive(
return archive;
}

private void setupModelVenv(Model model)
throws IOException, InterruptedException, ModelException {
String venvPath = EnvironmentUtils.getPythonVenvPath(model);
List<String> commandParts = new ArrayList<>();
commandParts.add(EnvironmentUtils.getPythonRunTime(model));
commandParts.add("-m");
commandParts.add("venv");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should you make this name customizable? Part of the appeal of this PR is different workers should be able to have different virtual environments

Copy link
Collaborator Author

@namannandan namannandan Jan 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently this PR creates a virtual environment on a per model basis at model load time. All workers for a given model use the same virtual environment. This replaces installing dependencies on a per model basis in a target directory and is backwards compatible with the existing behavior with no change to customer experience. Although the same name venv is used, they are located within the individual model directories, for ex: /tmp/models/test-model/venv.
Would it be useful to extend this implementation to support separate venv per worker?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No the isolation is fine as is I think

commandParts.add("--clear");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why clear? Seems beneficial to allow users to install their dependencies beforehand

It was always quite weird how we pip installed a bunch of stuff on launching a model

Copy link
Collaborator Author

@namannandan namannandan Jan 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, will check if venvs can safely be made portable, since the symlinks in the bin directory of the venv, for ex: venv/bin/python may need to be updated for them to work, since the python binary path may not be the same on the host on which the venv is created and the host on which the venv is used.

From the official docs: https://docs.python.org/3/library/venv.html
Warning: Because scripts installed in environments should not expect the environment to be activated, their shebang lines contain the absolute paths to their environment’s interpreters. Because of this, environments are inherently non-portable, in the general case. You should always have a simple means of recreating an environment (for example, if you have a requirements file requirements.txt, you can invoke pip install -r requirements.txt using the environment’s pip to install all of the packages needed by the environment).

commandParts.add("--system-site-packages");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why system site?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to make sure that the venv can see packages that are already installed in the base python environment and not have to install them again. If a newer version of an existing package is required or a non existing package is required, they will get installed to the venv site-packages and will take precedence over system-site-packages. This does not affect the base python environment. I've added more details with an example in the PR description: #2910 (comment)

commandParts.add(venvPath);

ProcessBuilder processBuilder = new ProcessBuilder(commandParts);

if (isValidVenvPath(venvPath)) {
processBuilder.directory(Paths.get(venvPath).toFile().getParentFile());
} else {
throw new ModelException(
"Invalid python venv path for model " + model.getModelName() + ": " + venvPath);
}
Map<String, String> environment = processBuilder.environment();
String[] envp =
EnvironmentUtils.getEnvString(
configManager.getModelServerHome(),
model.getModelDir().getAbsolutePath(),
null);
for (String envVar : envp) {
String[] parts = envVar.split("=", 2);
if (parts.length == 2) {
environment.put(parts[0], parts[1]);
}
}
processBuilder.redirectErrorStream(true);

Process process = processBuilder.start();

int exitCode = process.waitFor();
String line;
StringBuilder outputString = new StringBuilder();
BufferedReader brdr = new BufferedReader(new InputStreamReader(process.getInputStream()));
while ((line = brdr.readLine()) != null) {
outputString.append(line + "\n");
}

if (exitCode == 0) {
logger.info(
"Created virtual environment for model {}: {}", model.getModelName(), venvPath);
} else {
logger.error(
"Virtual environment creation for model {} at {} failed:\n{}",
model.getModelName(),
venvPath,
outputString.toString());
throw new ModelException(
"Virtual environment creation failed for model " + model.getModelName());
}
}

private void setupModelDependencies(Model model)
throws IOException, InterruptedException, ModelException {
String requirementsFile =
model.getModelArchive().getManifest().getModel().getRequirementsFile();

if (configManager.getInstallPyDepPerModel() && requirementsFile != null) {
Path requirementsFilePath =
Paths.get(model.getModelDir().getAbsolutePath(), requirementsFile);

setupModelVenv(model);
String pythonRuntime = EnvironmentUtils.getPythonRunTime(model);

File dependencyPath = model.getModelDir();
if (Files.isSymbolicLink(dependencyPath.toPath())) {
dependencyPath = dependencyPath.getParentFile();
if (!isValidVenvPath(pythonRuntime)) {
throw new ModelException(
"Invalid python venv runtime path for model "
+ model.getModelName()
+ ": "
+ pythonRuntime);
}

Path requirementsFilePath =
Paths.get(model.getModelDir().getAbsolutePath(), requirementsFile);
List<String> commandParts = new ArrayList<>();

commandParts.add(pythonRuntime);
commandParts.add("-m");
commandParts.add("pip");
commandParts.add("install");
commandParts.add("-U");
commandParts.add("-t");
commandParts.add(dependencyPath.getAbsolutePath());
commandParts.add("--upgrade-strategy");
commandParts.add("only-if-needed");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good change but do you mind just telling me why you needed to make it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In prior versions of pip (i.e pip<10.0) when pip install -U -r requirements.txt is used, all the packages listed in requirements.txt and their dependencies are upgraded since by default the --upgrade-strategy was eager. --upgrade-strategy applies to the handling of dependencies of the packages specified in requirements.txt. In pip>=10.0, the default --upgrade-stragegy is only-if-needed. This change is to explicitly make the --upgrade-strategy as only-if-needed irrespective of pip version.

From the pip docs: https://pip.pypa.io/en/stable/user_guide/#only-if-needed-recursive-upgrade

commandParts.add("-r");
commandParts.add(requirementsFilePath.toString());

ProcessBuilder processBuilder = new ProcessBuilder(commandParts);

String[] envp =
EnvironmentUtils.getEnvString(
configManager.getModelServerHome(),
model.getModelDir().getAbsolutePath(),
null);

ProcessBuilder processBuilder = new ProcessBuilder(commandParts);
if (isValidDependencyPath(dependencyPath)) {
processBuilder.directory(dependencyPath);
} else {
throw new ModelException(
"Invalid 3rd party package installation path "
+ dependencyPath.getCanonicalPath());
}

Map<String, String> environment = processBuilder.environment();
for (String envVar : envp) {
String[] parts = envVar.split("=", 2);
if (parts.length == 2) {
environment.put(parts[0], parts[1]);
}
}
Process process = processBuilder.start();
int exitCode = process.waitFor();

if (exitCode != 0) {
processBuilder.directory(
Paths.get(EnvironmentUtils.getPythonVenvPath(model)).toFile().getParentFile());
processBuilder.redirectErrorStream(true);

String line;
StringBuilder outputString = new StringBuilder();
// process's stdout is InputStream for caller process
BufferedReader brdr =
new BufferedReader(new InputStreamReader(process.getInputStream()));
while ((line = brdr.readLine()) != null) {
outputString.append(line);
}
StringBuilder errorString = new StringBuilder();
// process's stderr is ErrorStream for caller process
brdr = new BufferedReader(new InputStreamReader(process.getErrorStream()));
while ((line = brdr.readLine()) != null) {
errorString.append(line);
}
Process process = processBuilder.start();

logger.error("Dependency installation stderr:\n" + errorString.toString());
int exitCode = process.waitFor();
String line;
StringBuilder outputString = new StringBuilder();
BufferedReader brdr =
new BufferedReader(new InputStreamReader(process.getInputStream()));
while ((line = brdr.readLine()) != null) {
outputString.append(line + "\n");
}

if (exitCode == 0) {
logger.info(
"Installed custom pip packages for model {}:\n{}",
model.getModelName(),
outputString.toString());
} else {
logger.error(
"Custom pip package installation failed for model {}:\n{}",
model.getModelName(),
outputString.toString());
throw new ModelException(
"Custom pip package installation failed for " + model.getModelName());
"Custom pip package installation failed for model " + model.getModelName());
}
}
}

private boolean isValidDependencyPath(File dependencyPath) {
if (dependencyPath
.toPath()
private boolean isValidVenvPath(String venvPath) {
if (Paths.get(venvPath)
.normalize()
.startsWith(FileUtils.getTempDirectory().toPath().normalize())) {
return true;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -1169,7 +1169,7 @@ public void testModelWithInvalidCustomPythonDependency()
Assert.assertEquals(TestUtils.getHttpStatus(), HttpResponseStatus.BAD_REQUEST);
Assert.assertEquals(
resp.getMessage(),
"Custom pip package installation failed for custom_invalid_python_dep");
"Custom pip package installation failed for model custom_invalid_python_dep");
TestUtils.setConfiguration(configManager, "install_py_dep_per_model", "false");
channel.close().sync();
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -424,7 +424,7 @@ public void testWorkflowWithInvalidCustomPythonDependencyModel()
Assert.assertEquals(TestUtils.getHttpStatus(), HttpResponseStatus.INTERNAL_SERVER_ERROR);
Assert.assertEquals(
resp.getMessage(),
"Workflow custom_invalid_python_dep has failed to register. Failures: [Workflow Node custom_invalid_python_dep__custom_invalid_python_dep failed to register. Details: Custom pip package installation failed for custom_invalid_python_dep__custom_invalid_python_dep]");
"Workflow custom_invalid_python_dep has failed to register. Failures: [Workflow Node custom_invalid_python_dep__custom_invalid_python_dep failed to register. Details: Custom pip package installation failed for model custom_invalid_python_dep__custom_invalid_python_dep]");
TestUtils.setConfiguration(configManager, "install_py_dep_per_model", "false");
channel.close().sync();
}
Expand Down