diff --git a/README.md b/README.md index 20ee451..8988fed 100644 --- a/README.md +++ b/README.md @@ -1,12 +1,116 @@ Vulkan Grass Rendering ================================== -**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 5** +**University of Pennsylvania, CIS 5650: GPU Programming and Architecture, Project 5** -* (TODO) YOUR NAME HERE -* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab) +* Michael Rabbitz + * [LinkedIn](https://www.linkedin.com/in/mike-rabbitz) +* Tested on: Windows 10, i7-9750H @ 2.60GHz 32GB, RTX 2060 6GB (Personal) -### (TODO: Your README) +![](img/grass.gif) -*DO NOT* leave the README to the last minute! It is a crucial part of the -project, and we will not be able to grade you without a good README. +## Part 1: Introduction + +This project is an implementation of techniques described in [Responsive Real-Time Grass Rendering for General 3D Scenes](https://www.cg.tuwien.ac.at/research/publications/2017/JAHRMANN-2017-RRTG/JAHRMANN-2017-RRTG-draft.pdf), using Vulkan to build an efficient grass simulator and renderer. Grass blades are represented as Bezier curves, with a compute shader handling physics and culling processes, while graphics shaders manage rendering. The goal is to achieve realistic, performance-efficient grass rendering suitable for real-time applications. + +The base code includes a basic Vulkan setup with a compute pipeline and two graphics pipelines. This implementation focuses on developing shaders for the grass compute and graphics pipelines, along with custom descriptor bindings necessary to manage data between these pipelines. + +## Part 2: Simulating Forces +To create realistic movement, we simulate environmental forces on each grass blade using a compute shader. + +Before we get into the simulated forces, here is an image showing the static, initial setup of the grass blades with no forces applied. 
+![](img/no_forces.PNG) + +### Gravity +The first simulated force is gravity, modeled as a constant downward acceleration of 9.81 m/s<sup>2</sup> plus a smaller "front" component along the blade's facing direction, as described in the paper. Without a counterforce, the blades are flattened onto the ground plane. + +|Gravity| +|:--:| +|![](img/gravity.PNG) | + +### Recovery +Next, we add the second simulated force, recovery, which counteracts gravity by pulling the blade tip `v2` back toward its initial position above the root, scaled by the blade's stiffness coefficient, following Hooke's law as derived in the paper. + +|Gravity + Recovery| +|:--:| +|![](img/gravity_recovery.PNG) | + +### Wind +Finally, we add the third simulated force, wind, using a custom wind function whose strength depends on how the blade is aligned with the wind direction and on how upright the blade currently stands. + +The arbitrary wind function we use is: `WIND_INTENSITY * vec3(cos(totalTime), 0.0, sin(totalTime)) * directional_alignment * height_ratio`, where `directional_alignment = 1 - |dot(wind direction, normalize(v2 - v0))|` and `height_ratio = dot(v2 - v0, up) / height`. + +|Gravity + Recovery + Wind| +|:--:| +|![](img/grass.gif) | + +## Part 3: Culling Tests +To optimize performance, three culling tests run in the compute shader to skip blades that don’t contribute to the final image: + +### Orientation Culling +In this technique, blades viewed nearly edge-on (front-facing direction within about 10 degrees of perpendicular to the view direction) are culled. Blades have no thickness, so such blades would cover less than a pixel and cause rendering artifacts. + +![](img/orientation_culling.gif) + +### View-Frustum Culling +In this technique, the root `v0`, the tip `v2`, and a midpoint of each Bezier curve are tested against the view frustum with a small tolerance, and a blade is excluded when all three points fall outside. + +![](img/frustrum_culling.gif) + +### Distance Culling +In this technique, blades are grouped into buckets by their distance from the camera. Each more distant bucket culls a larger fraction of its blades, and every blade beyond a maximum distance is culled. + +![](img/distance_culling.gif) + +## Part 4: Performance Analysis +- Performance is reported in frames per second (FPS), measured with `glfwGetTime()` in the main loop and averaged over windows of roughly one second. +- The Test Scene is positioned so that many grass blades are rendered as the count increases and so that all three culling tests take effect. 
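The windowed FPS measurement described above can be sketched as a small helper. This is a minimal illustration, not the project's code: `FpsCounter` and `tick` are hypothetical names, and the caller is assumed to pass a monotonically increasing time in seconds (for example the value returned by `glfwGetTime()`).

```cpp
#include <cassert>

// Hypothetical helper (illustrative only): averages FPS over ~1 second windows.
struct FpsCounter {
    double windowStart = 0.0; // time at which the current window began
    int frames = 0;           // frames counted in the current window
    double fps = 0.0;         // average of the last completed window

    // Call once per frame with the current time in seconds.
    // Returns the most recently completed window's average FPS.
    double tick(double now) {
        assert(now >= windowStart); // the time source is assumed monotonic
        ++frames;
        const double elapsed = now - windowStart;
        if (elapsed > 1.0) {
            fps = frames / elapsed;
            windowStart = now;
            frames = 0;
        }
        return fps;
    }
};
```

Averaging over a window rather than timing single frames smooths out per-frame jitter, at the cost of the reported value lagging by up to one window.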
+ + +|Test Scene| +|:--:| +|![](img/test_scene.PNG) | + +### Runtime vs Blade Count +- Culling ON means all three culling tests are enabled. +- Culling OFF means all three culling tests are disabled. + + +![](img/runtime_blade_count.png) + +| Blade Count | Culling OFF (FPS) | Culling ON (FPS) | +| ------------- | ----------------- | ----------------- | +|2<sup>10</sup> |1235 |1245 | +|2<sup>12</sup> |1145 |1230 | +|2<sup>14</sup> |565 |1130 | +|2<sup>16</sup> |195 |545 | +|2<sup>18</sup> |62 |180 | +|2<sup>20</sup> |17 |60 | +|2<sup>22</sup> |4 |16 | +|2<sup>24</sup> |1 |4 | + +**Observations** +- **Trend:** As the blade count increases, FPS drops sharply in both cases. From 2<sup>14</sup> blades onward, each 4x increase in blade count costs roughly 3x to 4x in FPS. +- **Culling Efficiency:** From 2<sup>14</sup> blades onward, culling is roughly 2x to 4x faster than no culling and still reaches 60 FPS at 2<sup>20</sup> blades, where the unculled scene runs at 17 FPS. +- **Culling Overhead:** There is no measurable overhead at low blade counts: at 2<sup>10</sup> and 2<sup>12</sup> blades, Culling ON is at least as fast as Culling OFF and the two stay within about 7% of each other. + +### Runtime vs Culling Options +- Blade Count is 2<sup>16</sup> for the following tests. + + +![](img/runtime_culling.png) + +| Culling Option(s) | FPS | +| -------------------------- | --- | +|Culling OFF |195 | +|Orientation |225 | +|View Frustum |200 | +|Distance |500 | +|Orientation + View Frustum |215 | +|Orientation + Distance |530 | +|View Frustum + Distance |505 | +|All Culling |545 | + +**Observations** +- **Distance Culling Effectiveness:** Distance culling alone raises FPS by about 2.5x (195 to 500), which makes it the most impactful optimization in this scene. Orientation culling alone gains about 15% and view-frustum culling alone about 3%. +- **All Culling:** Enabling all three tests adds only about 9% over distance culling alone (500 to 545), indicating diminishing returns when the tests are combined. 
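The bucketed distance test from Part 3 can be sketched on the CPU as follows. This is a minimal illustration under stated assumptions: it assumes the keep rule `id mod n < floor(n * (1 - d_proj / d_max))`, and the function names `distanceCulled` and `survivors` are hypothetical, not part of the project. The values 16 and 32 in the usage note mirror `DISTANCE_CULLING_N_BUCKETS` and `DISTANCE_CULLING_MAX` from the compute shader.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical CPU-side mirror of a bucketed distance test (illustrative names).
// Returns true when the blade with the given index should be culled.
bool distanceCulled(std::uint32_t bladeIndex, float dProj, float dMax, int numBuckets) {
    assert(dMax > 0.0f && numBuckets > 0);
    if (dProj >= dMax) {
        return true; // beyond the maximum distance every blade is culled
    }
    // Number of index classes (out of numBuckets) that survive at this distance.
    const int kept = static_cast<int>(std::floor(numBuckets * (1.0f - dProj / dMax)));
    const int bucketSlot = static_cast<int>(bladeIndex % static_cast<std::uint32_t>(numBuckets));
    return bucketSlot >= kept;
}

// Counts how many of `count` consecutive blade indices survive at distance dProj.
int survivors(std::uint32_t count, float dProj, float dMax, int numBuckets) {
    int alive = 0;
    for (std::uint32_t i = 0; i < count; ++i) {
        if (!distanceCulled(i, dProj, dMax, numBuckets)) {
            ++alive;
        }
    }
    return alive;
}
```

With 16 buckets and a maximum distance of 32, `survivors(16, 0.0f, 32.0f, 16)` keeps all 16 blades, while a distance of 16 keeps 8 and a distance of 24 keeps 4, so density falls off linearly with distance.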
diff --git a/bin/Release/vulkan_grass_rendering.exe b/bin/Release/vulkan_grass_rendering.exe index f68db3a..9b96974 100644 Binary files a/bin/Release/vulkan_grass_rendering.exe and b/bin/Release/vulkan_grass_rendering.exe differ diff --git a/img/distance_culling.gif b/img/distance_culling.gif new file mode 100644 index 0000000..bd0c5e3 Binary files /dev/null and b/img/distance_culling.gif differ diff --git a/img/frustrum_culling.gif b/img/frustrum_culling.gif new file mode 100644 index 0000000..e5a7e0a Binary files /dev/null and b/img/frustrum_culling.gif differ diff --git a/img/grass.gif b/img/grass.gif index 78f008e..8fc08a2 100644 Binary files a/img/grass.gif and b/img/grass.gif differ diff --git a/img/grass2.gif b/img/grass2.gif deleted file mode 100644 index 3f14616..0000000 Binary files a/img/grass2.gif and /dev/null differ diff --git a/img/grass_basic.gif b/img/grass_basic.gif deleted file mode 100644 index 3b04705..0000000 Binary files a/img/grass_basic.gif and /dev/null differ diff --git a/img/gravity.PNG b/img/gravity.PNG new file mode 100644 index 0000000..286b25f Binary files /dev/null and b/img/gravity.PNG differ diff --git a/img/gravity_recovery.PNG b/img/gravity_recovery.PNG new file mode 100644 index 0000000..210e794 Binary files /dev/null and b/img/gravity_recovery.PNG differ diff --git a/img/no_forces.PNG b/img/no_forces.PNG new file mode 100644 index 0000000..18e75ec Binary files /dev/null and b/img/no_forces.PNG differ diff --git a/img/orientation_culling.gif b/img/orientation_culling.gif new file mode 100644 index 0000000..983c076 Binary files /dev/null and b/img/orientation_culling.gif differ diff --git a/img/runtime_blade_count.png b/img/runtime_blade_count.png new file mode 100644 index 0000000..4dd0fa9 Binary files /dev/null and b/img/runtime_blade_count.png differ diff --git a/img/runtime_culling.png b/img/runtime_culling.png new file mode 100644 index 0000000..ccb56cf Binary files /dev/null and b/img/runtime_culling.png differ diff --git 
a/img/test_scene.PNG b/img/test_scene.PNG new file mode 100644 index 0000000..bf085ad Binary files /dev/null and b/img/test_scene.PNG differ diff --git a/src/Blades.cpp b/src/Blades.cpp index 80e3d76..e275dfa 100644 --- a/src/Blades.cpp +++ b/src/Blades.cpp @@ -44,8 +44,8 @@ Blades::Blades(Device* device, VkCommandPool commandPool, float planeDim) : Mode indirectDraw.firstVertex = 0; indirectDraw.firstInstance = 0; - BufferUtils::CreateBufferFromData(device, commandPool, blades.data(), NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, bladesBuffer, bladesBufferMemory); - BufferUtils::CreateBuffer(device, NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, culledBladesBuffer, culledBladesBufferMemory); + BufferUtils::CreateBufferFromData(device, commandPool, blades.data(), NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, bladesBuffer, bladesBufferMemory); + BufferUtils::CreateBuffer(device, NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, culledBladesBuffer, culledBladesBufferMemory); BufferUtils::CreateBufferFromData(device, commandPool, &indirectDraw, sizeof(BladeDrawIndirect), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT, numBladesBuffer, numBladesBufferMemory); } diff --git a/src/Blades.h b/src/Blades.h index 9bd1eed..900e921 100644 --- a/src/Blades.h +++ b/src/Blades.h @@ -4,12 +4,12 @@ #include #include "Model.h" -constexpr static unsigned int NUM_BLADES = 1 << 13; +constexpr static unsigned int NUM_BLADES = 1 << 16; constexpr static float MIN_HEIGHT = 1.3f; constexpr static float MAX_HEIGHT = 2.5f; constexpr static float MIN_WIDTH = 0.1f; constexpr static float MAX_WIDTH = 0.14f; -constexpr static float MIN_BEND = 7.0f; +constexpr static float MIN_BEND = 9.0f; constexpr static float MAX_BEND = 13.0f; struct Blade { diff --git 
a/src/Renderer.cpp b/src/Renderer.cpp index b445d04..03b8931 100644 --- a/src/Renderer.cpp +++ b/src/Renderer.cpp @@ -195,9 +195,43 @@ void Renderer::CreateTimeDescriptorSetLayout() { } void Renderer::CreateComputeDescriptorSetLayout() { - // TODO: Create the descriptor set layout for the compute pipeline + // DONE: Create the descriptor set layout for the compute pipeline // Remember this is like a class definition stating why types of information // will be stored at each binding + + // Describe the bindings of the descriptor set layout + VkDescriptorSetLayoutBinding inputBladesLayoutBinding = {}; + inputBladesLayoutBinding.binding = 0; + inputBladesLayoutBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + inputBladesLayoutBinding.descriptorCount = 1; + inputBladesLayoutBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT; + inputBladesLayoutBinding.pImmutableSamplers = nullptr; + + VkDescriptorSetLayoutBinding culledBladesLayoutBinding = {}; + culledBladesLayoutBinding.binding = 1; + culledBladesLayoutBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + culledBladesLayoutBinding.descriptorCount = 1; + culledBladesLayoutBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT; + culledBladesLayoutBinding.pImmutableSamplers = nullptr; + + VkDescriptorSetLayoutBinding numRemainingBladesLayoutBinding = {}; + numRemainingBladesLayoutBinding.binding = 2; + numRemainingBladesLayoutBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + numRemainingBladesLayoutBinding.descriptorCount = 1; + numRemainingBladesLayoutBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT; + numRemainingBladesLayoutBinding.pImmutableSamplers = nullptr; + + std::vector<VkDescriptorSetLayoutBinding> bindings = { inputBladesLayoutBinding, culledBladesLayoutBinding, numRemainingBladesLayoutBinding }; + + // Create the descriptor set layout + VkDescriptorSetLayoutCreateInfo layoutInfo = {}; + layoutInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO; + layoutInfo.bindingCount = 
static_cast<uint32_t>(bindings.size()); + layoutInfo.pBindings = bindings.data(); + + if (vkCreateDescriptorSetLayout(logicalDevice, &layoutInfo, nullptr, &computeDescriptorSetLayout) != VK_SUCCESS) { + throw std::runtime_error("Failed to create descriptor set layout"); + } } void Renderer::CreateDescriptorPool() { @@ -215,7 +249,10 @@ void Renderer::CreateDescriptorPool() { // Time (compute) { VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER , 1 }, - // TODO: Add any additional types and counts of descriptors you will need to allocate + // DONE: Add any additional types and counts of descriptors you will need to allocate + + // Blades (compute): three storage buffers per group of blades + { VK_DESCRIPTOR_TYPE_STORAGE_BUFFER , static_cast<uint32_t>(3 * scene->GetBlades().size()) }, }; VkDescriptorPoolCreateInfo poolInfo = {}; @@ -318,8 +355,45 @@ void Renderer::CreateModelDescriptorSets() { } void Renderer::CreateGrassDescriptorSets() { - // TODO: Create Descriptor sets for the grass. + // DONE: Create Descriptor sets for the grass. // This should involve creating descriptor sets which point to the model matrix of each group of grass blades + + grassDescriptorSets.resize(scene->GetBlades().size()); + + // Describe the descriptor set + VkDescriptorSetLayout layouts[] = { modelDescriptorSetLayout }; + VkDescriptorSetAllocateInfo allocInfo = {}; + allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO; + allocInfo.descriptorPool = descriptorPool; + allocInfo.descriptorSetCount = static_cast<uint32_t>(grassDescriptorSets.size()); + allocInfo.pSetLayouts = layouts; + + // Allocate descriptor sets + if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, grassDescriptorSets.data()) != VK_SUCCESS) { + throw std::runtime_error("Failed to allocate descriptor set"); + } + + std::vector<VkWriteDescriptorSet> descriptorWrites(grassDescriptorSets.size()); + + for (uint32_t i = 0; i < scene->GetBlades().size(); ++i) { + VkDescriptorBufferInfo grassBufferInfo = {}; + grassBufferInfo.buffer = scene->GetBlades()[i]->GetModelBuffer(); + grassBufferInfo.offset = 0; + 
grassBufferInfo.range = sizeof(ModelBufferObject); + + descriptorWrites[i].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[i].dstSet = grassDescriptorSets[i]; + descriptorWrites[i].dstBinding = 0; + descriptorWrites[i].dstArrayElement = 0; + descriptorWrites[i].descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER; + descriptorWrites[i].descriptorCount = 1; + descriptorWrites[i].pBufferInfo = &grassBufferInfo; + descriptorWrites[i].pImageInfo = nullptr; + descriptorWrites[i].pTexelBufferView = nullptr; + + // Update this descriptor set inside the loop, while the loop-local grassBufferInfo it points to is still alive + vkUpdateDescriptorSets(logicalDevice, 1, &descriptorWrites[i], 0, nullptr); + } } void Renderer::CreateTimeDescriptorSet() { @@ -358,8 +432,77 @@ void Renderer::CreateTimeDescriptorSet() { } void Renderer::CreateComputeDescriptorSets() { - // TODO: Create Descriptor sets for the compute pipeline - // The descriptors should point to Storage buffers which will hold the grass blades, the culled grass blades, and the output number of grass blades + // DONE: Create Descriptor sets for the compute pipeline + // The descriptors should point to Storage buffers which will hold the grass blades, the culled grass blades, and the output number of grass blades + + computeDescriptorSets.resize(scene->GetBlades().size()); + + // Describe the descriptor set + VkDescriptorSetLayout layouts[] = { computeDescriptorSetLayout }; + VkDescriptorSetAllocateInfo allocInfo = {}; + allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO; + allocInfo.descriptorPool = descriptorPool; + allocInfo.descriptorSetCount = static_cast<uint32_t>(computeDescriptorSets.size()); + allocInfo.pSetLayouts = layouts; + + // Allocate descriptor sets + if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, computeDescriptorSets.data()) != VK_SUCCESS) { + throw std::runtime_error("Failed to allocate descriptor set"); + } + + std::vector<VkWriteDescriptorSet> descriptorWrites(3 * computeDescriptorSets.size()); + + for (uint32_t i = 0; i < 
scene->GetBlades().size(); ++i) { + VkDescriptorBufferInfo inputBladesBufferInfo = {}; + inputBladesBufferInfo.buffer = scene->GetBlades()[i]->GetBladesBuffer(); + inputBladesBufferInfo.offset = 0; + inputBladesBufferInfo.range = sizeof(Blade) * NUM_BLADES; + + VkDescriptorBufferInfo culledBladesBufferInfo = {}; + culledBladesBufferInfo.buffer = scene->GetBlades()[i]->GetCulledBladesBuffer(); + culledBladesBufferInfo.offset = 0; + culledBladesBufferInfo.range = sizeof(Blade) * NUM_BLADES; + + VkDescriptorBufferInfo numRemainingBladesBufferInfo = {}; + numRemainingBladesBufferInfo.buffer = scene->GetBlades()[i]->GetNumBladesBuffer(); + numRemainingBladesBufferInfo.offset = 0; + numRemainingBladesBufferInfo.range = sizeof(BladeDrawIndirect); + + uint32_t descriptorWritesIdx = 3 * i; + + descriptorWrites[descriptorWritesIdx].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[descriptorWritesIdx].dstSet = computeDescriptorSets[i]; + descriptorWrites[descriptorWritesIdx].dstBinding = 0; + descriptorWrites[descriptorWritesIdx].dstArrayElement = 0; + descriptorWrites[descriptorWritesIdx].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + descriptorWrites[descriptorWritesIdx].descriptorCount = 1; + descriptorWrites[descriptorWritesIdx].pBufferInfo = &inputBladesBufferInfo; + descriptorWrites[descriptorWritesIdx].pImageInfo = nullptr; + descriptorWrites[descriptorWritesIdx++].pTexelBufferView = nullptr; + + descriptorWrites[descriptorWritesIdx].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[descriptorWritesIdx].dstSet = computeDescriptorSets[i]; + descriptorWrites[descriptorWritesIdx].dstBinding = 1; + descriptorWrites[descriptorWritesIdx].dstArrayElement = 0; + descriptorWrites[descriptorWritesIdx].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + descriptorWrites[descriptorWritesIdx].descriptorCount = 1; + descriptorWrites[descriptorWritesIdx].pBufferInfo = &culledBladesBufferInfo; + 
descriptorWrites[descriptorWritesIdx].pImageInfo = nullptr; + descriptorWrites[descriptorWritesIdx++].pTexelBufferView = nullptr; + + descriptorWrites[descriptorWritesIdx].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[descriptorWritesIdx].dstSet = computeDescriptorSets[i]; + descriptorWrites[descriptorWritesIdx].dstBinding = 2; + descriptorWrites[descriptorWritesIdx].dstArrayElement = 0; + descriptorWrites[descriptorWritesIdx].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + descriptorWrites[descriptorWritesIdx].descriptorCount = 1; + descriptorWrites[descriptorWritesIdx].pBufferInfo = &numRemainingBladesBufferInfo; + descriptorWrites[descriptorWritesIdx].pImageInfo = nullptr; + descriptorWrites[descriptorWritesIdx].pTexelBufferView = nullptr; + + // Update this set's three writes inside the loop, while the loop-local buffer infos they point to are still alive + vkUpdateDescriptorSets(logicalDevice, 3, &descriptorWrites[3 * i], 0, nullptr); + } } void Renderer::CreateGraphicsPipeline() { @@ -716,8 +859,8 @@ void Renderer::CreateComputePipeline() { computeShaderStageInfo.module = computeShaderModule; computeShaderStageInfo.pName = "main"; - // TODO: Add the compute dsecriptor set layout you create to this list - std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, timeDescriptorSetLayout }; + // DONE: Add the compute descriptor set layout you create to this list + std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, timeDescriptorSetLayout, computeDescriptorSetLayout }; // Create pipeline layout VkPipelineLayoutCreateInfo pipelineLayoutInfo = {}; @@ -883,7 +1026,11 @@ void Renderer::RecordComputeCommandBuffer() { // Bind descriptor set for time uniforms vkCmdBindDescriptorSets(computeCommandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipelineLayout, 1, 1, &timeDescriptorSet, 0, nullptr); - // TODO: For each group of blades bind its descriptor set and dispatch + // DONE: For each group of blades bind its descriptor set and dispatch + for (uint32_t i = 0; i < 
scene->GetBlades().size(); ++i) { + vkCmdBindDescriptorSets(computeCommandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipelineLayout, 2, 1, &computeDescriptorSets[i], 0, nullptr); + vkCmdDispatch(computeCommandBuffer, (NUM_BLADES + WORKGROUP_SIZE - 1) / WORKGROUP_SIZE, 1, 1); + } // ~ End recording ~ if (vkEndCommandBuffer(computeCommandBuffer) != VK_SUCCESS) { @@ -975,14 +1122,15 @@ void Renderer::RecordCommandBuffers() { for (uint32_t j = 0; j < scene->GetBlades().size(); ++j) { VkBuffer vertexBuffers[] = { scene->GetBlades()[j]->GetCulledBladesBuffer() }; VkDeviceSize offsets[] = { 0 }; - // TODO: Uncomment this when the buffers are populated - // vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets); + // DONE: Uncomment this when the buffers are populated + vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets); - // TODO: Bind the descriptor set for each grass blades model + // DONE: Bind the descriptor set for each grass blades model + vkCmdBindDescriptorSets(commandBuffers[i], VK_PIPELINE_BIND_POINT_GRAPHICS, grassPipelineLayout, 1, 1, &grassDescriptorSets[j], 0, nullptr); // Draw - // TODO: Uncomment this when the buffers are populated - // vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect)); + // DONE: Uncomment this when the buffers are populated + vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect)); } // End render pass @@ -1041,7 +1189,7 @@ void Renderer::Frame() { Renderer::~Renderer() { vkDeviceWaitIdle(logicalDevice); - // TODO: destroy any resources you created + // DONE: destroy any resources you created vkFreeCommandBuffers(logicalDevice, graphicsCommandPool, static_cast<uint32_t>(commandBuffers.size()), commandBuffers.data()); vkFreeCommandBuffers(logicalDevice, computeCommandPool, 1, &computeCommandBuffer); @@ -1057,6 +1205,7 @@ Renderer::~Renderer() { 
vkDestroyDescriptorSetLayout(logicalDevice, cameraDescriptorSetLayout, nullptr); vkDestroyDescriptorSetLayout(logicalDevice, modelDescriptorSetLayout, nullptr); vkDestroyDescriptorSetLayout(logicalDevice, timeDescriptorSetLayout, nullptr); + vkDestroyDescriptorSetLayout(logicalDevice, computeDescriptorSetLayout, nullptr); vkDestroyDescriptorPool(logicalDevice, descriptorPool, nullptr); diff --git a/src/Renderer.h b/src/Renderer.h index 95e025f..36caa9b 100644 --- a/src/Renderer.h +++ b/src/Renderer.h @@ -56,12 +56,15 @@ class Renderer { VkDescriptorSetLayout cameraDescriptorSetLayout; VkDescriptorSetLayout modelDescriptorSetLayout; VkDescriptorSetLayout timeDescriptorSetLayout; + VkDescriptorSetLayout computeDescriptorSetLayout; VkDescriptorPool descriptorPool; VkDescriptorSet cameraDescriptorSet; std::vector<VkDescriptorSet> modelDescriptorSets; VkDescriptorSet timeDescriptorSet; + std::vector<VkDescriptorSet> grassDescriptorSets; + std::vector<VkDescriptorSet> computeDescriptorSets; VkPipelineLayout graphicsPipelineLayout; VkPipelineLayout grassPipelineLayout; diff --git a/src/SwapChain.cpp b/src/SwapChain.cpp index 711fec0..70fa7b8 100644 --- a/src/SwapChain.cpp +++ b/src/SwapChain.cpp @@ -74,14 +74,17 @@ SwapChain::SwapChain(Device* device, VkSurfaceKHR vkSurface, unsigned int numBuf } } -void SwapChain::Create() { +void SwapChain::Create(int width, int height) { auto* instance = device->GetInstance(); const auto& surfaceCapabilities = instance->GetSurfaceCapabilities(); VkSurfaceFormatKHR surfaceFormat = chooseSwapSurfaceFormat(instance->GetSurfaceFormats()); VkPresentModeKHR presentMode = chooseSwapPresentMode(instance->GetPresentModes()); - VkExtent2D extent = chooseSwapExtent(surfaceCapabilities, GetGLFWWindow()); + VkExtent2D extent{ static_cast<uint32_t>(width), static_cast<uint32_t>(height) }; // explicit casts avoid a narrowing conversion in brace initialization + if (width == 0 || height == 0) { + extent = chooseSwapExtent(surfaceCapabilities, GetGLFWWindow()); + } uint32_t imageCount = surfaceCapabilities.minImageCount + 1; imageCount = numBuffers > imageCount ? 
numBuffers : imageCount; @@ -188,9 +191,9 @@ VkSemaphore SwapChain::GetRenderFinishedVkSemaphore() const { return renderFinishedSemaphore; } -void SwapChain::Recreate() { +void SwapChain::Recreate(int width, int height) { Destroy(); - Create(); + Create(width, height); } bool SwapChain::Acquire() { diff --git a/src/SwapChain.h b/src/SwapChain.h index dbafcf0..545c72d 100644 --- a/src/SwapChain.h +++ b/src/SwapChain.h @@ -17,14 +17,14 @@ class SwapChain { VkSemaphore GetImageAvailableVkSemaphore() const; VkSemaphore GetRenderFinishedVkSemaphore() const; - void Recreate(); + void Recreate(int width = 0, int height = 0); bool Acquire(); bool Present(); ~SwapChain(); private: SwapChain(Device* device, VkSurfaceKHR vkSurface, unsigned int numBuffers); - void Create(); + void Create(int width = 0, int height = 0); void Destroy(); Device* device; diff --git a/src/main.cpp b/src/main.cpp index 8bf822b..56f846c 100644 --- a/src/main.cpp +++ b/src/main.cpp @@ -5,6 +5,7 @@ #include "Camera.h" #include "Scene.h" #include "Image.h" +#include <sstream> Device* device; SwapChain* swapChain; @@ -16,7 +17,7 @@ namespace { if (width == 0 || height == 0) return; vkDeviceWaitIdle(device->GetVkDevice()); - swapChain->Recreate(); + swapChain->Recreate(width, height); renderer->RecreateFrameResources(); } @@ -139,14 +140,36 @@ int main() { renderer = new Renderer(device, swapChain, scene, camera); - glfwSetWindowSizeCallback(GetGLFWWindow(), resizeCallback); - glfwSetMouseButtonCallback(GetGLFWWindow(), mouseDownCallback); - glfwSetCursorPosCallback(GetGLFWWindow(), mouseMoveCallback); + GLFWwindow* window = GetGLFWWindow(); + glfwSetWindowSizeCallback(window, resizeCallback); + glfwSetMouseButtonCallback(window, mouseDownCallback); + glfwSetCursorPosCallback(window, mouseMoveCallback); + + double fps = 0; + double timebase = 0; + int frame = 0; while (!ShouldQuit()) { glfwPollEvents(); + + frame++; + double time = glfwGetTime(); + + if (time - timebase > 1.0) { + fps = frame / (time - timebase); 
+ timebase = time; + frame = 0; + } + scene->UpdateTime(); renderer->Frame(); + + std::ostringstream ss; + ss << "["; + ss.precision(1); + ss << std::fixed << fps; + ss << " fps] " << applicationName; + glfwSetWindowTitle(window, ss.str().c_str()); } vkDeviceWaitIdle(device->GetVkDevice()); diff --git a/src/shaders/compute.comp b/src/shaders/compute.comp index 0fd0224..e30b087 100644 --- a/src/shaders/compute.comp +++ b/src/shaders/compute.comp @@ -1,6 +1,16 @@ #version 450 #extension GL_ARB_separate_shader_objects : enable +#define ORIENTATION_CULLING true +#define VIEW_FRUSTRUM_CULLING true +#define DISTANCE_CULLING true + +#define VIEW_FRUSTRUM_CULLING_TOLERANCE 1.0 +#define DISTANCE_CULLING_N_BUCKETS 16 +#define DISTANCE_CULLING_MAX 32 + +#define WIND_INTENSITY 1.0 + #define WORKGROUP_SIZE 32 layout(local_size_x = WORKGROUP_SIZE, local_size_y = 1, local_size_z = 1) in; @@ -21,7 +31,7 @@ struct Blade { vec4 up; }; -// TODO: Add bindings to: +// DONE: Add bindings to: // 1. Store the input blades // 2. Write out the culled blades // 3. 
Write the total number of blades remaining @@ -36,21 +46,151 @@ struct Blade { // uint firstInstance; // = 0 // } numBlades; +layout(set = 2, binding = 0) buffer Blades { + Blade blades[]; +}; + +layout(set = 2, binding = 1) buffer CulledBlades { + Blade culledBlades[]; +}; + +layout(set = 2, binding = 2) buffer NumBlades { + uint vertexCount; // Write the number of blades remaining here + uint instanceCount; // = 1 + uint firstVertex; // = 0 + uint firstInstance; // = 0 +} numBlades; + + bool inBounds(float value, float bounds) { return (value >= -bounds) && (value <= bounds); } void main() { + uint idx = gl_GlobalInvocationID.x; + // Reset the number of blades to 0 - if (gl_GlobalInvocationID.x == 0) { - // numBlades.vertexCount = 0; + if (idx == 0) { + numBlades.vertexCount = 0; } barrier(); // Wait till all threads reach this point - // TODO: Apply forces on every blade and update the vertices in the buffer + // DONE: Apply forces on every blade and update the vertices in the buffer + Blade blade = blades[idx]; + vec3 v0 = blade.v0.xyz; // Position + vec3 v1 = blade.v1.xyz; // Bezier point + vec3 v2 = blade.v2.xyz; // Physical model guide + vec3 up = blade.up.xyz; // Up vector + float direction = blade.v0.w; + float height = blade.v1.w; + float width = blade.v2.w; + float stiffness_coeff = blade.up.w; + + // GRAVITY + vec3 gravity_direction = vec3(0.0, -1.0, 0.0); + float gravity_magnitude_of_acceleration = 9.81; + vec3 gE = gravity_direction * gravity_magnitude_of_acceleration; + + vec3 rotation = normalize(vec3(cos(direction), 0.0, sin(direction))); + vec3 f = normalize(cross(up, rotation)); + + vec3 gF = 0.25 * length(gE) * f; + + vec3 gravity = gE + gF; + + + // RECOVERY + vec3 iv2 = v0 + up * height; + vec3 recovery = (iv2 - v2) * stiffness_coeff; + + // WIND + vec3 windFunction = vec3(cos(totalTime), 0.0, sin(totalTime)); + float directional_alignment = 1.0 - abs(dot(normalize(windFunction), normalize(v2 - v0))); + float height_ratio = dot(v2 - v0, up) / 
height; + vec3 wind = WIND_INTENSITY * windFunction * directional_alignment * height_ratio; + + // TOTAL FORCE + vec3 tv2 = (gravity + recovery + wind) * deltaTime; + + v2 += tv2; - // TODO: Cull blades that are too far away or not in the camera frustum and write them + // STATE VALIDATION + + // Ensure a position of v2 above the local plane + v2 = v2 - up * min(dot(up, v2 - v0), 0.0); + + // Constrain v1 to always be above v0 + float l_proj = length(v2 - v0 - up * dot(v2 - v0, up)); + v1 = v0 + height * up * max(1.0 - (l_proj / height), 0.05 * max(l_proj / height, 1.0)); + + // Ensure that the length of the Bezier curve is not larger than the height of the blade + float L0 = length(v2 - v0); + float L1 = length(v2 - v1) + length(v1 - v0); + float L = (2.0 * L0 + L1) / 3.0; // length approximation (2*L0 + (n-1)*L1) / (n+1) for a Bezier curve of degree n = 2 + float r = height / L; + // Scale both control points about v0; v2 needs the uncorrected v1, so it is updated first + v2 = v0 + r * (v1 - v0) + r * (v2 - v1); + v1 = v0 + r * (v1 - v0); + + blade.v1.xyz = v1; + blade.v2.xyz = v2; + + blades[idx] = blade; + + + // DONE: Cull blades that are too far away or not in the camera frustum and write them // to the culled blades buffer // Note: to do this, you will need to use an atomic operation to read and update numBlades.vertexCount // You want to write the visible blades to the buffer without write conflicts between threads + + if (ORIENTATION_CULLING || VIEW_FRUSTRUM_CULLING || DISTANCE_CULLING) { + + // Taking the inverse of the view matrix transforms from view space back to world space + // Translation component (position) stored in index 3 + vec3 cam_pos_world = inverse(camera.view)[3].xyz; + + // Viewing Direction: v0 - cam_pos_world + // Component of the viewing vector along the up axis: up * dot(v0 - cam_pos_world, up) + // Subtracting that component from the Viewing Direction leaves only the part that lies in the ground plane of the blade (perpendicular to up) + vec3 cam_view_dir_world_projected = v0 - cam_pos_world - up * dot(v0 - 
cam_pos_world, up); + + // ORIENTATION CULLING: cull blades seen edge-on (front direction within 10 degrees of perpendicular to the view) + if (ORIENTATION_CULLING && abs(dot(normalize(cam_view_dir_world_projected), normalize(cross(up, rotation)))) < cos(radians(80.0))) { + return; + } + + // VIEW-FRUSTUM CULLING + if (VIEW_FRUSTRUM_CULLING) { + vec3 mid = 0.25 * v0 + 0.5 * v1 + 0.25 * v2; // point on the quadratic Bezier curve at t = 0.5 + + vec4 ndc_v0 = camera.proj * camera.view * vec4(v0, 1.0); + vec4 ndc_v2 = camera.proj * camera.view * vec4(v2, 1.0); + vec4 ndc_mid = camera.proj * camera.view * vec4(mid, 1.0); + + float v0_h = ndc_v0.w + VIEW_FRUSTRUM_CULLING_TOLERANCE; + float v2_h = ndc_v2.w + VIEW_FRUSTRUM_CULLING_TOLERANCE; + float mid_h = ndc_mid.w + VIEW_FRUSTRUM_CULLING_TOLERANCE; + + if (!(inBounds(ndc_v0.x , v0_h ) && inBounds(ndc_v0.y , v0_h ) && inBounds(ndc_v0.z , v0_h )) && + !(inBounds(ndc_v2.x , v2_h ) && inBounds(ndc_v2.y , v2_h ) && inBounds(ndc_v2.z , v2_h )) && + !(inBounds(ndc_mid.x, mid_h) && inBounds(ndc_mid.y, mid_h) && inBounds(ndc_mid.z, mid_h)) + ) { // a point is outside if any coordinate is out of bounds; cull only when all three points are outside + return; + } + } + + // DISTANCE CULLING + if (DISTANCE_CULLING) { + float d_proj = length(cam_view_dir_world_projected); + float d_max = DISTANCE_CULLING_MAX; + int n = DISTANCE_CULLING_N_BUCKETS; + + if (d_proj > d_max || int(idx % uint(n)) >= int(float(n) * (1.0 - d_proj / d_max))) { // keep fewer blades per bucket as distance grows + return; + } + } + } + + uint atomicIdx = atomicAdd(numBlades.vertexCount, 1); + culledBlades[atomicIdx] = blades[idx]; } diff --git a/src/shaders/grass.frag b/src/shaders/grass.frag index c7df157..318077d 100644 --- a/src/shaders/grass.frag +++ b/src/shaders/grass.frag @@ -6,12 +6,23 @@ layout(set = 0, binding = 0) uniform CameraBufferObject { mat4 proj; } camera; -// TODO: Declare fragment shader inputs +// DONE: Declare fragment shader inputs + +layout(location = 0) in vec3 frag_pos; // Blade Position in world space +layout(location = 1) in vec3 frag_n; // Blade Normal in world space +layout(location = 2) in vec2 frag_uv; layout(location = 0) out vec4 outColor; void main() { - // TODO: Compute fragment color + // DONE: Compute fragment color + + vec3 grass_color = 
vec3(0.25, 0.75, 0.25); + + vec3 light_dir = normalize(vec3(0.33, -0.33, -0.33)); + float lambert = clamp(dot(frag_n, light_dir), 0.25, 0.99); + + vec4 lambert_color = vec4(grass_color * lambert, 1.0); - outColor = vec4(1.0); + outColor = mix(vec4(frag_uv.x, frag_uv.y, 0.0, 1.0), lambert_color, 1.0); } diff --git a/src/shaders/grass.tesc b/src/shaders/grass.tesc index f9ffd07..8c5b932 100644 --- a/src/shaders/grass.tesc +++ b/src/shaders/grass.tesc @@ -8,19 +8,36 @@ layout(set = 0, binding = 0) uniform CameraBufferObject { mat4 proj; } camera; -// TODO: Declare tessellation control shader inputs and outputs +// DONE: Declare tessellation control shader inputs and outputs + +in gl_PerVertex +{ + vec4 gl_Position; // Blade Position in world space (xyz) and direction (w) +} gl_in[gl_MaxPatchVertices]; + +layout(location = 0) in vec4 tesc_v1[]; // Blade Bezier point in world space (xyz) and height (w) +layout(location = 1) in vec4 tesc_v2[]; // Blade Physical model guide in world space (xyz) and width (w) +layout(location = 2) in vec4 tesc_up[]; // Blade Up vector in world space (xyz) and stiffness coefficient (w) + +// out vec4 gl_Position; // Blade Position in world space (xyz) and direction (w) +layout(location = 0) out vec4 tese_v1[]; // Blade Bezier point in world space (xyz) and height (w) +layout(location = 1) out vec4 tese_v2[]; // Blade Physical model guide in world space (xyz) and width (w) +layout(location = 2) out vec4 tese_up[]; // Blade Up vector in world space (xyz) and stiffness coefficient (w) void main() { // Don't move the origin location of the patch gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position; - // TODO: Write any shader outputs + // DONE: Write any shader outputs + tese_v1[gl_InvocationID] = tesc_v1[gl_InvocationID]; + tese_v2[gl_InvocationID] = tesc_v2[gl_InvocationID]; + tese_up[gl_InvocationID] = tesc_up[gl_InvocationID]; - // TODO: Set level of tesselation - // gl_TessLevelInner[0] = ??? 
- // gl_TessLevelInner[1] = ??? - // gl_TessLevelOuter[0] = ??? - // gl_TessLevelOuter[1] = ??? - // gl_TessLevelOuter[2] = ??? - // gl_TessLevelOuter[3] = ??? + // DONE: Set level of tessellation + gl_TessLevelInner[0] = 10.0; + gl_TessLevelInner[1] = 10.0; + gl_TessLevelOuter[0] = 10.0; + gl_TessLevelOuter[1] = 10.0; + gl_TessLevelOuter[2] = 10.0; + gl_TessLevelOuter[3] = 10.0; } diff --git a/src/shaders/grass.tese b/src/shaders/grass.tese index 751fff6..59e4a9b 100644 --- a/src/shaders/grass.tese +++ b/src/shaders/grass.tese @@ -8,11 +8,48 @@ layout(set = 0, binding = 0) uniform CameraBufferObject { mat4 proj; } camera; -// TODO: Declare tessellation evaluation shader inputs and outputs +// DONE: Declare tessellation evaluation shader inputs and outputs + +// in vec4 gl_Position; // Blade Position in world space (xyz) and direction (w) +layout(location = 0) in vec4 tese_v1[]; // Blade Bezier point in world space (xyz) and height (w) +layout(location = 1) in vec4 tese_v2[]; // Blade Physical model guide in world space (xyz) and width (w) +layout(location = 2) in vec4 tese_up[]; // Blade Up vector in world space (xyz) and stiffness coefficient (w) + +// out vec4 gl_Position; // Blade Position in clip space +layout(location = 0) out vec3 frag_pos; // Blade Position in world space +layout(location = 1) out vec3 frag_n; // Blade Normal in world space +layout(location = 2) out vec2 frag_uv; void main() { float u = gl_TessCoord.x; float v = gl_TessCoord.y; - // TODO: Use u and v to parameterize along the grass blade and output positions for each vertex of the grass blade + // DONE: Use u and v to parameterize along the grass blade and output positions for each vertex of the grass blade + + vec3 v0 = gl_in[0].gl_Position.xyz; + vec3 v1 = tese_v1[0].xyz; + vec3 v2 = tese_v2[0].xyz; + + vec3 a = v0 + v * (v1 - v0); + vec3 b = v1 + v * (v2 - v1); + vec3 c = a + v * (b - a); // point on the quadratic Bezier curve at parameter v (De Casteljau) + + float direction = gl_in[0].gl_Position.w; + vec3 t1 = normalize(vec3(cos(direction), 0.0, sin(direction))); +
+ float width = tese_v2[0].w; + vec3 c0 = c - width * t1; + vec3 c1 = c + width * t1; + + vec3 t0 = normalize(b - a); + + frag_n = normalize(cross(t0, t1)); + + float t = u + 0.5 * v - u * v; // triangle shape: the blade narrows to a point at the tip (v = 1) + vec4 world_pos = vec4((1.0 - t) * c0 + t * c1, 1.0); + frag_pos = world_pos.xyz; + + gl_Position = camera.proj * camera.view * world_pos; + + frag_uv = vec2(u, v); } diff --git a/src/shaders/grass.vert b/src/shaders/grass.vert index db9dfe9..5f65ffd 100644 --- a/src/shaders/grass.vert +++ b/src/shaders/grass.vert @@ -6,12 +6,26 @@ layout(set = 1, binding = 0) uniform ModelBufferObject { mat4 model; }; -// TODO: Declare vertex shader inputs and outputs +// DONE: Declare vertex shader inputs and outputs + +layout(location = 0) in vec4 vert_v0; // Blade Position in local space (xyz) and direction (w) +layout(location = 1) in vec4 vert_v1; // Blade Bezier point in local space (xyz) and height (w) +layout(location = 2) in vec4 vert_v2; // Blade Physical model guide in local space (xyz) and width (w) +layout(location = 3) in vec4 vert_up; // Blade Up vector in local space (xyz) and stiffness coefficient (w) out gl_PerVertex { - vec4 gl_Position; + vec4 gl_Position; // Blade Position in world space (xyz) and direction (w) }; +layout(location = 0) out vec4 tesc_v1; // Blade Bezier point in world space (xyz) and height (w) +layout(location = 1) out vec4 tesc_v2; // Blade Physical model guide in world space (xyz) and width (w) +layout(location = 2) out vec4 tesc_up; // Blade Up vector in world space (xyz) and stiffness coefficient (w) + void main() { - // TODO: Write gl_Position and any other shader outputs + // DONE: Write gl_Position and any other shader outputs + + gl_Position = vec4((model * vec4(vert_v0.xyz, 1.0)).xyz, vert_v0.w); + tesc_v1 = vec4((model * vec4(vert_v1.xyz, 1.0)).xyz, vert_v1.w); + tesc_v2 = vec4((model * vec4(vert_v2.xyz, 1.0)).xyz, vert_v2.w); + tesc_up = vec4((model * vec4(vert_up.xyz, 0.0)).xyz, vert_up.w); // w = 0.0: up is a direction, so the model translation must not be applied }