COMP7606 Enhance AI Agent (OS-Copilot)

This project enhances the functionality of an AI agent copilot. The copilot is based on OS-Copilot (GitHub) and helps the user carry out tasks.

For my part, I enhanced the base prompt and added new features: support for Windows environments through Batch file execution, memory of previous chat history, memory of previous sub-tasks, and some new tools.

Updating the Prompt to Support Previous Tasks and History

PROMPT = {
    "_USER_WINDOWS_GENERATE_PROMPT": """
        User's information is as follows:
        System Version: {system_version}  # System Information, Windows / Linux / Mac etc.
        System language: {system_language}
        Working Directory: {working_dir}
        Task Name: {task_name}
        Task Description: {task_description}
        Information of Prerequisite Tasks: {pre_tasks_info}    # Previous History
        Code Type: {Type}"""
}
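
As a rough illustration of how these placeholders might be filled, the sketch below formats a template with the same fields using `str.format`. The `build_user_prompt` helper, the sample task values, the hard-coded system language, and the choice to serialize `pre_tasks_info` as JSON are all assumptions made for this example, not the project's actual code.

```python
import json
import os
import platform

# Same placeholders as _USER_WINDOWS_GENERATE_PROMPT, repeated so the sketch is self-contained.
USER_PROMPT_TEMPLATE = """
User's information is as follows:
System Version: {system_version}
System language: {system_language}
Working Directory: {working_dir}
Task Name: {task_name}
Task Description: {task_description}
Information of Prerequisite Tasks: {pre_tasks_info}
Code Type: {Type}
"""


def build_user_prompt(task_name, task_description, pre_tasks, code_type):
    """Hypothetical helper: fill the template for a single sub-task."""
    return USER_PROMPT_TEMPLATE.format(
        system_version=platform.platform(),
        system_language="en-US",  # assumed value; how the project detects it is not shown
        working_dir=os.getcwd(),
        task_name=task_name,
        task_description=task_description,
        pre_tasks_info=json.dumps(pre_tasks),  # assumed serialization of earlier sub-task results
        Type=code_type,
    )


# Sample data only: one earlier sub-task whose result is passed to the next one.
previous = {"create_dir": {"description": "Create the output folder", "return_val": "done"}}
prompt = build_user_prompt("list_files", "List the files in the folder", previous, "Batch")
```

Passing the earlier sub-task results through a single placeholder keeps the template unchanged no matter how many prerequisite tasks exist.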

PROMPT = {
    # ! shell for Windows (Batch)
    "_SYSTEM_WINDOWS_GENERATE_PROMPT": """
    You are a world-class programmer that can complete any task by executing code, your goal is to generate the corresponding code based on the type of code to complete the task.
    You could only respond with a code.
    Windows Batch File code output Format:
    ```batch
    batch code
    ```

    The code you write should follow the following criteria:
    1. You must generate code of the specified 'Code Type' to complete the task.
    2. The code logic should be clear and highly readable, able to meet the requirements of the task.
    """
}
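
Because the system prompt asks the model to reply with a single fenced batch block, the agent has to pull the code out of the reply before running it. The sketch below shows one way this could be done with a regular expression. The `extract_batch_code` helper and the sample reply are hypothetical and are not taken from the project.

```python
import re

# Three backticks, built programmatically so this example does not nest code fences.
FENCE = "`" * 3


def extract_batch_code(response):
    """Hypothetical helper: return the body of the first batch fence, or None if absent."""
    pattern = FENCE + r"batch\s*\n(.*?)" + FENCE
    match = re.search(pattern, response, re.DOTALL)
    return match.group(1).strip() if match else None


# A sample reply in the format the system prompt requests.
reply = f"{FENCE}batch\n@echo off\necho hello\n{FENCE}"
code = extract_batch_code(reply)
```

Returning `None` when no fence is found lets the caller decide whether to retry the request or report an error.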

Different Environment Handling

# env.py
import os
import subprocess
import time


class Env(BaseEnv):
    def __init__(self):
        super().__init__()
        self.languages = [
            PythonJupyterEnv,
            Shell,
            Batch,  # New Batch for Windows
            AppleScript,
        ]
        self._active_languages = {}

    def step(self, language, code, stream=False, display=False):
        ...
        if os.name == "nt":  # Windows
            # * Command to list the files in the current directory
            cmd = ["cmd", "/c", "dir"]
        else:  # Unix/Linux
            # * Command to list the files in the current directory
            cmd = ["ls"]
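
The same `os.name` check can also decide which script language the agent asks the model to generate. The following is a minimal sketch of that idea. Both helper functions and the optional `os_name` parameter (added so the branch can be exercised on any machine) are assumptions for illustration.

```python
import os


def default_script_language(os_name=None):
    """Hypothetical helper: name of the script environment to use on this OS."""
    os_name = os.name if os_name is None else os_name
    return "Batch" if os_name == "nt" else "Shell"


def list_dir_command(os_name=None):
    """Command that lists the current directory, mirroring the branch in Env.step."""
    os_name = os.name if os_name is None else os_name
    return ["cmd", "/c", "dir"] if os_name == "nt" else ["ls"]


language = default_script_language()  # "Batch" on Windows, "Shell" elsewhere
```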

class Batch(SubprocessEnv):
    """
    A class representing the Windows Command Prompt environment for executing .bat scripts.

    It follows the structure of the Shell class, but does not include the
    mainline handling, which is not needed for a batch script.

    The batch code is saved to temp.bat and run as a whole.
    It is not sent line by line, as Shell commands are.
    """

    file_extension = "bat"
    name = "Batch"
    aliases = ["bat", "cmd"]

    def __init__(self):
        """
        Initializes the Batch environment.

        Uses cmd.exe for running Windows shell scripts (.bat).
        """
        super().__init__()
        self.start_cmd = ["cmd.exe", "/c"]  # "/c" will execute and then close cmd

    def run_bat_file(self, code):
        task_complete = {
            "status": False,
            "content": "",
        }

        try:
            # Step 0: Preprocess the code
            # Adding the end of execution marker
            code = self.preprocess_code(code)

            # Step 1: Create a temporary .bat file
            with open("temp.bat", "w") as f:
                f.write(code)

            # Step 2: Run the .bat script using subprocess.Popen
            process = subprocess.Popen(
                ["cmd.exe", "/c", "temp.bat"],
                stdout=subprocess.PIPE,
                stderr=subprocess.PIPE,
                text=True,  # Capture the output as text, not bytes
            )

            # Step 3: Monitor the output for the '##end_of_execution##' marker
            while True:
                output = process.stdout.readline()  # Read one line of output at a time

                # If the process ends and no more output is available, break the loop
                if output == "" and process.poll() is not None:
                    break

                # Print output for debugging purposes
                if output:
                    task_complete["content"] += output
                    print(f"OUTPUT: {output.strip()}")

                # Check if the end-of-execution marker is found
                if "##end_of_execution##" in output:
                    task_complete["status"] = True
                    print("Task completed successfully!")
                    break

                # Sleep for a bit to avoid busy-waiting
                time.sleep(0.1)

            # Step 4: Check for any remaining stderr output
            # If "##end_of_execution##" was not found, the task did not complete (error).
            # The status does not need to change, because it defaults to False.
            error_output = process.stderr.read()
            if error_output:
                print(f"ERROR: {error_output.strip()}")

            # Step 5: Ensure the process has completed
            process.wait()

            # Step 6: Cleanup the temporary file
            try:
                os.remove("temp.bat")
            except Exception as e:
                print(f"Error deleting temp.bat: {e}")

            return task_complete
        except Exception as e:
            print(f"Error running batch script: {e}")
            task_complete["content"] = f"Error running batch script: {e}"

            # * The failing batch script is left in temp.bat so it can be debugged

            return task_complete

    def preprocess_code(self, code):
        """
        Preprocesses the batch script code before execution.

        Since batch scripts don't need as much preprocessing, this can be minimal.
        """
        return preprocess_bat(code)

    def line_postprocessor(self, line):
        """
        Postprocesses each line of output from the batch script execution.

        Args:
            line (str): A line from the output of the batch script execution.

        Returns:
            str: The processed line.
        """
        return line

    def detect_active_line(self, line):
        """
        Batch scripts don't have a specific active line marker, but we could add one.
        For now, return None.
        """
        return None

    def detect_end_of_execution(self, line):
        """
        Detects the end of execution marker in the output.

        Args:
            line (str): A line from the output.

        Returns:
            bool: True if the end of execution marker is found, False otherwise.
        """
        return "##end_of_execution##" in line
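
The `preprocess_bat` function called by `preprocess_code` is not shown here. The comments in `run_bat_file` say only that it adds the end-of-execution marker, so the sketch below is a guess at the smallest version that would satisfy the loop above: it appends an `echo` of the marker to the script. The function body and the `saw_end_marker` helper are assumptions, not the project's implementation.

```python
END_MARKER = "##end_of_execution##"


def preprocess_bat(code):
    """Assumed behaviour: trim the script and append a line that echoes the marker."""
    code = code.strip()
    return f"{code}\necho {END_MARKER}\n"


def saw_end_marker(output_lines):
    """Hypothetical helper: True if any captured output line contains the marker."""
    return any(END_MARKER in line for line in output_lines)


script = preprocess_bat("  @echo off\necho hello\n")
```

With this approach, a script that exits before its last line never prints the marker, so `status` stays `False`.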

This is a group project, and the project report is not available for public viewing. The original OS-Copilot repository can be found on GitHub.