
Getting Started

Arno Hartholt edited this page May 8, 2026 · 9 revisions

Overview

To try out the VHToolkit executable versions, check out Releases or try the WebGL demo.

To start development, follow the detailed instructions below.

Requirements

Hardware and Software Requirements

  • Unity version: 6000.1.5f1
  • Unity Editor system requirements
  • Development environment:
    • Main development OS: Windows (macOS and Linux are possible, but less tested)
    • IDE: MS Visual Studio 2022
    • Supporting tools: Git (required for certain Unity packages)
    • Lip sync generation for pre-recorded audio: FaceFX (requires a separate license)

Unity Download and Installation

  • Download and install the Unity Hub
  • Run the Unity Hub and log in with a Unity account
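  • Note that there is no need to download a Unity version yet; the runUnity script described under Loading the Unity Project downloads the proper version automatically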

AI Cloud Services and API Keys

The VHToolkit uses AI Cloud services. The provided executable releases can be run directly out of the box. For the Unity projects, you'll need to create and enter your own API keys in the configuration file. For the sample project, the main required services are:

  • OpenAI for speech recognition and natural language processing. In the Organization Settings of the API Platform, select the API keys menu to create your key.
  • ElevenLabs for text-to-speech synthesis. Create your API key in the Developer section.

Alternatives:

To add your own API keys to the VHToolkit, do the following in Unity while the project is running:

  • Use the Debug menu at the top left and go to the Config submenu
  • Click Open Folder Location and open the ride.json file
  • Add your keys to the appropriate sections, e.g.:
    • elevenLabs
    • openAIChatGPT
    • openAIRealtime
  • Save the file
  • Restart the Unity project
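
Before restarting, it can help to confirm that the edited file is still valid JSON and still contains the sections listed above. The following Python sketch is one way to do that, not part of the VHToolkit. The helper names find_keys and missing_sections are hypothetical, and because the internal layout of ride.json is not described here, the check only looks for the section names anywhere in the file and does not verify the keys themselves.

```python
import json

# Section names taken from the steps above; extend as needed.
REQUIRED_SECTIONS = ("elevenLabs", "openAIChatGPT", "openAIRealtime")


def find_keys(node, found=None):
    """Collect every object key that appears anywhere in a parsed JSON tree."""
    if found is None:
        found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            found.add(key)
            find_keys(value, found)
    elif isinstance(node, list):
        for item in node:
            find_keys(item, found)
    return found


def missing_sections(config_text, required=REQUIRED_SECTIONS):
    """Return the required section names that do not occur in the JSON text.

    Raises json.JSONDecodeError if the text is not valid JSON, which is
    itself a useful signal after hand-editing the file.
    """
    present = find_keys(json.loads(config_text))
    return [name for name in required if name not in present]
```

To use it, read the contents of ride.json from the folder opened via Open Folder Location and pass the text to missing_sections; an empty list means all three section names were found.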

Alternatively, AI services can be run as local endpoints that the VHToolkit connects to.

Organization

Unity Projects

The VHToolkit consists of a Unity application that includes nonverbal behavior generation and realization, together with integrations for audio-visual sensing, speech recognition, natural language processing, and text-to-speech services. It comes with two sample Unity projects, one for each of its main rendering pipelines:

  • VHUnityURP (Universal Render Pipeline): lower fidelity graphics that run on all platforms, including mobile and the web
  • VHUnityHDRP (High Definition Render Pipeline): higher fidelity graphics that mainly run on desktops

Choose whichever project best fits your hardware platform. Both projects are at feature parity and differ only in the Unity rendering pipeline used. For details on downloading and installing Unity, see Unity Download and Installation above.

Note that all character art assets are in separate Assets Unity projects, one each for URP and HDRP, where they are built as asset bundles that the main VHToolkit project downloads automatically. This keeps the VHToolkit executable small.

RIDE Packages

The VHToolkit is powered by the RIDE platform, which is distributed through packages. These packages can be used independently for Unity-based AI development, with native integrations for AWS, Anthropic, Azure, ElevenLabs, OpenAI, vLLM, and Stability AI, among others.

The VHToolkit Unity sample projects use the following packages:

  • RIDE.Cognition: contains interfaces, implementations, and samples for audio-visual sensing, speech recognition, natural language processing, and text-to-speech.
  • RIDE.VH: contains the interface, implementation, and sample for nonverbal behavior generation and realization.

Both packages automatically pull in dependent packages, including RIDE.Abstract and RIDE.Core, which contain the main RIDE API and implementations for core functionality such as logging, configuration, and web service interfaces.

Getting Up and Running

Loading the Unity Project

  • Choose a project folder per the preferred rendering pipeline, either VHUnityHDRP or VHUnityURP
  • In the main project folder, run 'runUnity.bat' on Windows or 'runUnity.sh' on macOS
    • Note that this automatically downloads the proper version of Unity if it has not yet been installed
    • Note that on macOS, this requires Mono. To install Mono, first install Homebrew, then run 'brew install mono' in a terminal
  • Open and Play the main scene, Assets/Scenes/SampleScene
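
As a rough aid for the steps above, the following Python sketch picks the launcher script by operating system and reports which command-line tools mentioned in this guide are missing from the PATH. It is not part of the VHToolkit. The function names launcher_for and missing_tools are hypothetical, and it assumes that Git and Mono are invoked as git and mono.

```python
import platform
import shutil


def launcher_for(system):
    """Return the launcher script named in this guide for an OS name.

    `system` uses the values returned by platform.system():
    "Windows" or "Darwin" (macOS). Other systems raise ValueError because
    the guide does not name a launcher script for them.
    """
    if system == "Windows":
        return "runUnity.bat"
    if system == "Darwin":
        return "runUnity.sh"
    raise ValueError(f"no launcher script is documented for {system!r}")


def missing_tools(system, which=shutil.which):
    """List the tools mentioned in this guide that are not found on the PATH.

    `which` is injectable so the check can be exercised without touching
    the real environment.
    """
    required = ["git"]            # needed by certain Unity packages
    if system == "Darwin":
        required.append("mono")   # needed by runUnity.sh on macOS
    return [tool for tool in required if which(tool) is None]
```

Calling missing_tools(platform.system()) from the project folder gives a quick list of what still needs installing before running the script returned by launcher_for.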

Navigating the SampleScene

  • Click on any of the user-facing UI elements to interact with the character.
  • Further guidance is displayed in the initial Overview menu of the Debug Menu in the upper left of the screen.
  • Advance to the Main submenu and select ICT-made characters, Microsoft Rocketbox characters, or Reallusion Character Creator characters (requires license).
  • Use the various Sensing, ASR, NLP, and TTS capability buttons to switch and compare different services.
  • Navigate to the other submenus to explore more debug functionalities.

Development

Unity Project Organization

  • The Unity project is organized into:
    • Systems: all RIDE systems (ASR, NLP, TTS, etc.), the DemoController (main logic for the demo), and DebugMenus
    • Environment: main camera, environment art assets, and lighting
    • Characters: all characters and their configurations
    • VHCanvas: all UI elements
  • The RIDE.Cognition and RIDE.VH packages include all the AI-related systems, organized by area. These systems can typically be configured in the Editor.
  • The DemoController organizes and configures the RIDE AI and Debug systems, and orchestrates Sensing, ASR, NLP, TTS, and NVB services to facilitate the conversation between user and character:
    • DemoControllerBase.cs contains most initialization and conversational flow.
    • DemoController.cs is a derived class for desktop, mobile, and web.
    • DemoControllerAR.cs is the derived class for the Quest AR demo.
    • SelectCharacter(string characterName) is the primary function in these classes.
  • Most demo-specific C# scripts and prefabs are in \VHShared, for easier re-use between projects.

Character Setup

  • Character art assets are saved in one of the Assets projects as Unity Humanoids, each with an avatar definition and an animation controller with an associated gesture animation suite. ICT and Rocketbox characters are fully provided. Character Creator (CC) characters require a license. Note that only a subset of the Rocketbox characters is configured; they act as examples for the full library.
  • In the main VHToolkit project, these generic characters are turned into virtual humans by adding a suite of scripts as components to a character-specific game object.
  • Some of the main components are:
    • RideCatalogAsset: configuration to download the proper asset bundle created in the dedicated Assets project. These asset bundles are created per character, per rendering pipeline (URP and HDRP), and per hardware platform. This decoupling results in smaller executable sizes and the ability to update characters without having to recompile the application. The trade-off is that it requires more management and an AWS S3 setup. To use a character directly in a VHToolkit application, export it as a Unity package from the Assets project, and then import it into the VHToolkit project. It can then be set up with all the VH components discussed here.
    • VHCharacterProfile: TTS voices, NVBG configuration, and LLM prompt.
    • MecanimCharacter: Enables characters to realize common VH behaviors, including talking, gesturing, and gazing. Starting Posture needs to match the Idle Posture ID in the NVBG configuration.
    • FacialAnimationPlayer: defines visemes and facial expressions, and their weights. There are different versions for keyframe (ICT) and blendshape animations (CC, Rocketbox). CC characters in particular require an extensive mapping from their blendshapes to the data formats the VHToolkit uses. If lip sync looks odd or facial expressions are missing, check this component first.
    • BMLEventHandler: parses BML (Behavior Markup Language), which drives the character's behaviors. Requires a reference to the MecanimManager game object, which is part of the RIDE.VH package.
    • EyelidController: enables blinking. Requires a reference to the character game object.

Build Profiles

  • The default Build Profile is for Windows, and includes only the SampleScene. VH character asset bundles will download at runtime, similar to when playing in-editor.
    • Note that when generating a local standalone build, the initial build may take 1.5+ hours due to HDRP/URP shader compilation. Subsequent iterative builds are much quicker.

Development Approach

The VHToolkit demos have been designed and developed to provide a broad range of examples of how to set up any character, for any platform, driven by any technology. To develop your own applications, we encourage you to use the existing projects as a starting point and modify them to your needs. This involves the following steps:

  1. Decide on your main hardware platform, which drives the decision between URP and HDRP, character fidelity limitations, interaction paradigm (e.g., mouse, finger, AR/VR controller), and which AI services can be used.
  2. Decide on the character, either one of the provided ones, or one you create yourself.
  3. Provide domain knowledge for the character, either by changing the LLM prompt, using a scripted agent, or adding more fine-grained control in the DemoController C# script.
  4. Choose an environment and replace the provided one with, for example, assets purchased from the Asset Store.
  5. Optionally, add your own desired technology as an alternative to the provided ones. Sensing, ASR, NLP, TTS, and NVB each have their own principled API and C# interfaces, which can be implemented and extended.
    1. For example, SensingSystemAWSRekognition is the AWS Rekognition implementation of the generic SensingSystemUnity, which in turn implements the ISensingEmotionSystem, ISensingHeadSystem, and ISensingCharacteristicsSystem C# interfaces. The API provides the data structures that the rest of the VHToolkit, for example the MirroringController, can use. SensingSystemDeepFace is an alternative implementation. You can add your own, extend the data structure, or extend the API to provide more functionality. Start with one of the provided implementations in the area you are interested in, duplicate it, and adapt it to your new technology.
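
The layering described in step 5 (a shared interface and data structure, interchangeable service-specific implementations, and a consumer that only depends on the interface) can be sketched in a language-neutral way. The Python below is an illustration of that pattern only. Actual VHToolkit systems are written in C# against the RIDE interfaces, and every class, method, and field name in this sketch is hypothetical rather than taken from the RIDE API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class EmotionReading:
    """Hypothetical shared data structure returned by every sensing backend."""
    label: str
    confidence: float


class EmotionSensing(ABC):
    """Plays the role of an interface such as ISensingEmotionSystem.

    The method name and signature are invented for this sketch.
    """

    @abstractmethod
    def sense_emotion(self, frame: bytes) -> EmotionReading:
        ...


class CloudEmotionSensing(EmotionSensing):
    """Stand-in for a cloud-backed implementation (no real service is called)."""

    def sense_emotion(self, frame: bytes) -> EmotionReading:
        return EmotionReading(label="happy", confidence=0.9)


class LocalEmotionSensing(EmotionSensing):
    """Stand-in for an alternative, locally hosted implementation."""

    def sense_emotion(self, frame: bytes) -> EmotionReading:
        return EmotionReading(label="neutral", confidence=0.6)


class Mirroring:
    """Consumer that depends only on the interface, not on a specific backend."""

    def __init__(self, sensing: EmotionSensing):
        self.sensing = sensing

    def expression_for(self, frame: bytes, threshold: float = 0.5) -> str:
        reading = self.sensing.sense_emotion(frame)
        # Fall back to a neutral expression when the backend is unsure.
        return reading.label if reading.confidence >= threshold else "neutral"
```

Because the consumer only sees the interface, swapping CloudEmotionSensing for LocalEmotionSensing requires no change to Mirroring, which is the property that makes duplicating and adapting an existing implementation a practical starting point.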
