diff --git a/README.md b/README.md index 20ee451..7b8730a 100644 --- a/README.md +++ b/README.md @@ -3,10 +3,97 @@ Vulkan Grass Rendering **University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 5** -* (TODO) YOUR NAME HERE -* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab) +* Harris Kokkinakos + * [LinkedIn](https://www.linkedin.com/in/haralambos-kokkinakos-5311a3210/), [personal website](https://harriskoko.github.io/Harris-Projects/) +* Tested on: Windows 24H2, i9-12900H @ 2.50GHz 16GB, RTX 3070TI Mobile + +### Description +This project implements a Vulkan-based version of Responsive Real-Time Grass Rendering, adapted from Jahrmann & Wimmer (2017). +The goal of this project is to produce a satisfying and physically accurate representation of grass. +A compute pass performs physics evaluation and culling, while the tessellation pipeline generates detailed curved blades using control points derived from a quadratic Bézier model. + +RESULTS +================ +![gif](img/my_grass.gif) + +IMPLEMENTATION +================ + +### Grass Representation +Each blade is defined by three control points v0, v1, v2 forming a quadratic Bézier curve. Additionally, each blade has attributes including height, width, stiffness, up vector, and orientation. +* v0 = root fixed to terrain +* v2 = tip affected by forces +* v1 = intermediate control point derived from v0 and v2 + +![bez](img/blade_model.jpg) + +### Physics +For each frame, the compute shader updates all blades in parallel. It computes gravity, recovery, and wind forces and applies them to each grass blade. + +We separate gravity into two terms, environmental (gE) and front (gF). + +gE is the environmental gravity vector applied uniformly to all blades. 
+It represents the constant downward pull of gravity on the tip of each blade, modeled as: + +![ge](img/ge.png) + +gF is the front-facing gravity component, added to tilt the blade slightly in the direction it’s facing, producing a more natural lean instead of purely vertical bending. + +![gf](img/gf.png) + +These two forces are added together to get the total gravity force. + +The recovery force pulls each blade tip back toward its rest position. This counteracts bending caused by gravity and wind. It behaves like a spring force and is modeled as: + +![r](img/r.png) + +In this implementation, I use 0.1 in place of the final term to simplify the computation without visibly reducing quality. + +The wind force introduces dynamic, time-dependent bending to simulate airflow across the grass field. +Instead of using precomputed flow fields as in the paper, this implementation defines wind procedurally using trigonometric variation over both time and position, giving a natural wave motion that travels across the scene. + +All contributions sum into a total force F = G + R + W, updating v2 with time-step Δt. +The algorithm enforces length preservation and clamps vertical penetration, mirroring section 5.2 of the paper’s responsive model. + +### Culling +To maintain real-time performance, grass blades are culled directly on the GPU before rendering. +Each compute shader invocation decides whether a blade should be drawn based on its orientation, visibility, and distance relative to the camera. +Blades that pass all culling tests are written into a culled buffer and counted atomically for indirect drawing. + +Orientation culling removes blades whose front direction is almost parallel to the camera’s view direction (in this case, within 10%). +When a blade is seen nearly from the side, its thin geometry contributes little visually but adds unnecessary tessellation cost. 
+ +The algorithm computes the dot product between the camera forward vector and the blade’s front direction: + +![ori](img/ori.png) + +Frustum culling discards blades outside the camera’s visible volume. +Each blade’s base (v0), tip (v2), and midpoint (m) are transformed into clip space: + +![view](img/frustum.png) + +If all three points lie outside the frustum, the blade is culled. +This ensures that only blades potentially visible in the camera’s view are sent to tessellation and rasterization. + +Distance culling removes blades too far from the camera (in this case, more than 25 units away). +It computes the projected horizontal distance from the blade’s root (v0) to the camera position (camPos): + +![dist](img/dist.png) + +If a grass blade passes all three of these tests, it can be rendered. + +### Performance + +This Vulkan Grass Renderer was tested at varying numbers of grass blades as shown below. + +![p1](img/performance.png) + +As we can see from this chart, the renderer sustains high frame rates even with very large numbers of grass blades. The steep falloff in performance appears because the blade count increases exponentially between test points. + +Additionally, we can calculate the performance increase due to the culling optimizations. + +![p2](img/culling.png) + +As this chart shows, at 65536 grass blades, there is almost a 100 FPS improvement with the three culling methods enabled. This equates to a speedup of over 50% for the renderer, which makes culling a substantial improvement. This test primarily exercises frustum and orientation culling; distance culling did not come into play because the camera was close to the grass. Distance culling adds further improvement in games and other renderers where grass far away from the camera/player does not need to be drawn. -### (TODO: Your README) -*DO NOT* leave the README to the last minute! 
It is a crucial part of the -project, and we will not be able to grade you without a good README. diff --git a/img/culling.png b/img/culling.png new file mode 100644 index 0000000..e7ca592 Binary files /dev/null and b/img/culling.png differ diff --git a/img/dist.png b/img/dist.png new file mode 100644 index 0000000..c409458 Binary files /dev/null and b/img/dist.png differ diff --git a/img/dproj.png b/img/dproj.png new file mode 100644 index 0000000..741701a Binary files /dev/null and b/img/dproj.png differ diff --git a/img/frustum.png b/img/frustum.png new file mode 100644 index 0000000..6fb3629 Binary files /dev/null and b/img/frustum.png differ diff --git a/img/ge.png b/img/ge.png new file mode 100644 index 0000000..1d4caf3 Binary files /dev/null and b/img/ge.png differ diff --git a/img/gf.png b/img/gf.png new file mode 100644 index 0000000..24cf86b Binary files /dev/null and b/img/gf.png differ diff --git a/img/my_grass.gif b/img/my_grass.gif new file mode 100644 index 0000000..fbd73fb Binary files /dev/null and b/img/my_grass.gif differ diff --git a/img/ori.png b/img/ori.png new file mode 100644 index 0000000..b4bbd87 Binary files /dev/null and b/img/ori.png differ diff --git a/img/performance.png b/img/performance.png new file mode 100644 index 0000000..2a094d8 Binary files /dev/null and b/img/performance.png differ diff --git a/img/r.png b/img/r.png new file mode 100644 index 0000000..4f2d873 Binary files /dev/null and b/img/r.png differ diff --git a/src/Blades.cpp b/src/Blades.cpp index 80e3d76..fc15fde 100644 --- a/src/Blades.cpp +++ b/src/Blades.cpp @@ -45,7 +45,7 @@ Blades::Blades(Device* device, VkCommandPool commandPool, float planeDim) : Mode indirectDraw.firstInstance = 0; BufferUtils::CreateBufferFromData(device, commandPool, blades.data(), NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, bladesBuffer, bladesBufferMemory); - BufferUtils::CreateBuffer(device, NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, 
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, culledBladesBuffer, culledBladesBufferMemory); + BufferUtils::CreateBuffer(device, NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, culledBladesBuffer, culledBladesBufferMemory); BufferUtils::CreateBufferFromData(device, commandPool, &indirectDraw, sizeof(BladeDrawIndirect), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT, numBladesBuffer, numBladesBufferMemory); } @@ -68,4 +68,4 @@ Blades::~Blades() { vkFreeMemory(device->GetVkDevice(), culledBladesBufferMemory, nullptr); vkDestroyBuffer(device->GetVkDevice(), numBladesBuffer, nullptr); vkFreeMemory(device->GetVkDevice(), numBladesBufferMemory, nullptr); -} +} \ No newline at end of file diff --git a/src/Blades.h b/src/Blades.h index 9bd1eed..f329d50 100644 --- a/src/Blades.h +++ b/src/Blades.h @@ -4,7 +4,7 @@ #include #include "Model.h" -constexpr static unsigned int NUM_BLADES = 1 << 13; +constexpr static unsigned int NUM_BLADES = 1 << 12; constexpr static float MIN_HEIGHT = 1.3f; constexpr static float MAX_HEIGHT = 2.5f; constexpr static float MIN_WIDTH = 0.1f; diff --git a/src/Renderer.cpp b/src/Renderer.cpp index b445d04..ca7d383 100644 --- a/src/Renderer.cpp +++ b/src/Renderer.cpp @@ -9,7 +9,7 @@ static constexpr unsigned int WORKGROUP_SIZE = 32; Renderer::Renderer(Device* device, SwapChain* swapChain, Scene* scene, Camera* camera) - : device(device), + : device(device), logicalDevice(device->GetVkDevice()), swapChain(swapChain), scene(scene), @@ -198,6 +198,41 @@ void Renderer::CreateComputeDescriptorSetLayout() { // TODO: Create the descriptor set layout for the compute pipeline // Remember this is like a class definition stating why types of information // will be stored at each binding + // Input Blades + VkDescriptorSetLayoutBinding inputBladesBinding = {}; + inputBladesBinding.binding = 0; + inputBladesBinding.descriptorType = 
VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + inputBladesBinding.descriptorCount = 1; + inputBladesBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT; + inputBladesBinding.pImmutableSamplers = nullptr; + + // Culled Blades + VkDescriptorSetLayoutBinding culledBladesBinding = {}; + culledBladesBinding.binding = 1; + culledBladesBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + culledBladesBinding.descriptorCount = 1; + culledBladesBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT; + culledBladesBinding.pImmutableSamplers = nullptr; + + // Num Blades + VkDescriptorSetLayoutBinding numBladesBinding = {}; + numBladesBinding.binding = 2; + numBladesBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + numBladesBinding.descriptorCount = 1; + numBladesBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT; + numBladesBinding.pImmutableSamplers = nullptr; + + std::vector<VkDescriptorSetLayoutBinding> bindings = { inputBladesBinding, culledBladesBinding, numBladesBinding }; + + // Descriptor set layout + VkDescriptorSetLayoutCreateInfo layoutInfo = {}; + layoutInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO; + layoutInfo.bindingCount = static_cast<uint32_t>(bindings.size()); + layoutInfo.pBindings = bindings.data(); + + if (vkCreateDescriptorSetLayout(logicalDevice, &layoutInfo, nullptr, &computeDescriptorSetLayout) != VK_SUCCESS) { + throw std::runtime_error("Failed to create compute descriptor set layout"); + } } void Renderer::CreateDescriptorPool() { @@ -216,6 +251,8 @@ void Renderer::CreateDescriptorPool() { { VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER , 1 }, // TODO: Add any additional types and counts of descriptors you will need to allocate + // 3 storage buffers. input, culled, and num blades + { VK_DESCRIPTOR_TYPE_STORAGE_BUFFER , static_cast<uint32_t>(3 * scene->GetBlades().size()) }, }; VkDescriptorPoolCreateInfo poolInfo = {}; @@ -318,8 +355,42 @@ void Renderer::CreateModelDescriptorSets() { } void Renderer::CreateGrassDescriptorSets() { - // TODO: Create Descriptor sets for the grass. 
- // This should involve creating descriptor sets which point to the model matrix of each group of grass blades + grassDescriptorSets.resize(scene->GetBlades().size()); + + // Describe the descriptor set + VkDescriptorSetLayout layouts[] = { modelDescriptorSetLayout }; + VkDescriptorSetAllocateInfo allocInfo = {}; + allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO; + allocInfo.descriptorPool = descriptorPool; + allocInfo.descriptorSetCount = static_cast<uint32_t>(grassDescriptorSets.size()); + allocInfo.pSetLayouts = layouts; + + // Allocate descriptor sets + if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, grassDescriptorSets.data()) != VK_SUCCESS) { + throw std::runtime_error("Failed to allocate descriptor set"); + } + + std::vector<VkWriteDescriptorSet> descriptorWrites(grassDescriptorSets.size()); + + for (uint32_t i = 0; i < scene->GetBlades().size(); ++i) { + VkDescriptorBufferInfo modelBufferInfo = {}; + modelBufferInfo.buffer = scene->GetBlades()[i]->GetModelBuffer(); + modelBufferInfo.offset = 0; + modelBufferInfo.range = sizeof(ModelBufferObject); + + descriptorWrites[i].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[i].dstSet = grassDescriptorSets[i]; + descriptorWrites[i].dstBinding = 0; + descriptorWrites[i].dstArrayElement = 0; + descriptorWrites[i].descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER; + descriptorWrites[i].descriptorCount = 1; + descriptorWrites[i].pBufferInfo = &modelBufferInfo; + descriptorWrites[i].pImageInfo = nullptr; + descriptorWrites[i].pTexelBufferView = nullptr; + } + + // Update descriptor sets + vkUpdateDescriptorSets(logicalDevice, static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr); } void Renderer::CreateTimeDescriptorSet() { @@ -360,6 +431,70 @@ void Renderer::CreateTimeDescriptorSet() { void Renderer::CreateComputeDescriptorSets() { // TODO: Create Descriptor sets for the compute pipeline // The descriptors should point to Storage buffers which will hold the grass blades, the culled 
grass blades, and the output number of grass blades + computeDescriptorSets.resize(scene->GetBlades().size()); + + // Describe the descriptor set + VkDescriptorSetLayout layouts[] = { computeDescriptorSetLayout }; + VkDescriptorSetAllocateInfo allocInfo = {}; + allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO; + allocInfo.descriptorPool = descriptorPool; + allocInfo.descriptorSetCount = static_cast<uint32_t>(computeDescriptorSets.size()); + allocInfo.pSetLayouts = layouts; + + // Allocate descriptor sets + if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, computeDescriptorSets.data()) != VK_SUCCESS) { + throw std::runtime_error("Failed to allocate compute descriptor set"); + } + + std::vector<VkWriteDescriptorSet> descriptorWrites(3 * computeDescriptorSets.size()); + + for (uint32_t i = 0; i < scene->GetBlades().size(); ++i) { + // Binding 0: Input blades buffer + VkDescriptorBufferInfo inputBladesBufferInfo = {}; + inputBladesBufferInfo.buffer = scene->GetBlades()[i]->GetBladesBuffer(); + inputBladesBufferInfo.offset = 0; + inputBladesBufferInfo.range = NUM_BLADES * sizeof(Blade); + + descriptorWrites[3 * i + 0].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[3 * i + 0].dstSet = computeDescriptorSets[i]; + descriptorWrites[3 * i + 0].dstBinding = 0; + descriptorWrites[3 * i + 0].dstArrayElement = 0; + descriptorWrites[3 * i + 0].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + descriptorWrites[3 * i + 0].descriptorCount = 1; + descriptorWrites[3 * i + 0].pBufferInfo = &inputBladesBufferInfo; + + // Binding 1: Culled blades buffer + VkDescriptorBufferInfo culledBladesBufferInfo = {}; + culledBladesBufferInfo.buffer = scene->GetBlades()[i]->GetCulledBladesBuffer(); + culledBladesBufferInfo.offset = 0; + culledBladesBufferInfo.range = NUM_BLADES * sizeof(Blade); + + descriptorWrites[3 * i + 1].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[3 * i + 1].dstSet = computeDescriptorSets[i]; + descriptorWrites[3 * i + 1].dstBinding = 1; + 
descriptorWrites[3 * i + 1].dstArrayElement = 0; + descriptorWrites[3 * i + 1].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + descriptorWrites[3 * i + 1].descriptorCount = 1; + descriptorWrites[3 * i + 1].pBufferInfo = &culledBladesBufferInfo; + + // Binding 2: Num blades buffer + VkDescriptorBufferInfo numBladesBufferInfo = {}; + numBladesBufferInfo.buffer = scene->GetBlades()[i]->GetNumBladesBuffer(); + numBladesBufferInfo.offset = 0; + numBladesBufferInfo.range = sizeof(BladeDrawIndirect); + + descriptorWrites[3 * i + 2].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[3 * i + 2].dstSet = computeDescriptorSets[i]; + descriptorWrites[3 * i + 2].dstBinding = 2; + descriptorWrites[3 * i + 2].dstArrayElement = 0; + descriptorWrites[3 * i + 2].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + descriptorWrites[3 * i + 2].descriptorCount = 1; + descriptorWrites[3 * i + 2].pBufferInfo = &numBladesBufferInfo; + } + + // Update descriptor sets + vkUpdateDescriptorSets(logicalDevice, static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr); + } void Renderer::CreateGraphicsPipeline() { @@ -717,7 +852,7 @@ void Renderer::CreateComputePipeline() { computeShaderStageInfo.pName = "main"; // TODO: Add the compute dsecriptor set layout you create to this list - std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, timeDescriptorSetLayout }; + std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, timeDescriptorSetLayout, computeDescriptorSetLayout}; // Create pipeline layout VkPipelineLayoutCreateInfo pipelineLayoutInfo = {}; @@ -795,11 +930,11 @@ void Renderer::CreateFrameResources() { ); depthImageView = Image::CreateView(device, depthImage, depthFormat, VK_IMAGE_ASPECT_DEPTH_BIT); - + // Transition the image for use as depth-stencil Image::TransitionLayout(device, graphicsCommandPool, depthImage, depthFormat, VK_IMAGE_LAYOUT_UNDEFINED, VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL); - + // CREATE 
FRAMEBUFFERS framebuffers.resize(swapChain->GetCount()); for (size_t i = 0; i < swapChain->GetCount(); i++) { @@ -884,6 +1019,10 @@ void Renderer::RecordComputeCommandBuffer() { vkCmdBindDescriptorSets(computeCommandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipelineLayout, 1, 1, &timeDescriptorSet, 0, nullptr); // TODO: For each group of blades bind its descriptor set and dispatch + for (size_t i = 0; i < computeDescriptorSets.size(); i++) { + vkCmdBindDescriptorSets(computeCommandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipelineLayout, 2, 1, &computeDescriptorSets[i], 0, nullptr); + vkCmdDispatch(computeCommandBuffer, (NUM_BLADES + WORKGROUP_SIZE - 1) / WORKGROUP_SIZE, 1, 1); + } // ~ End recording ~ if (vkEndCommandBuffer(computeCommandBuffer) != VK_SUCCESS) { @@ -975,14 +1114,13 @@ void Renderer::RecordCommandBuffers() { for (uint32_t j = 0; j < scene->GetBlades().size(); ++j) { VkBuffer vertexBuffers[] = { scene->GetBlades()[j]->GetCulledBladesBuffer() }; VkDeviceSize offsets[] = { 0 }; - // TODO: Uncomment this when the buffers are populated - // vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets); + vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets); - // TODO: Bind the descriptor set for each grass blades model + // Bind the descriptor set for each grass blades model + vkCmdBindDescriptorSets(commandBuffers[i], VK_PIPELINE_BIND_POINT_GRAPHICS, grassPipelineLayout, 1, 1, &grassDescriptorSets[j], 0, nullptr); // Draw - // TODO: Uncomment this when the buffers are populated - // vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect)); + vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect)); } // End render pass @@ -1045,7 +1183,7 @@ Renderer::~Renderer() { vkFreeCommandBuffers(logicalDevice, graphicsCommandPool, static_cast<uint32_t>(commandBuffers.size()), commandBuffers.data()); 
vkFreeCommandBuffers(logicalDevice, computeCommandPool, 1, &computeCommandBuffer); - + vkDestroyPipeline(logicalDevice, graphicsPipeline, nullptr); vkDestroyPipeline(logicalDevice, grassPipeline, nullptr); vkDestroyPipeline(logicalDevice, computePipeline, nullptr); @@ -1054,9 +1192,11 @@ Renderer::~Renderer() { vkDestroyPipelineLayout(logicalDevice, grassPipelineLayout, nullptr); vkDestroyPipelineLayout(logicalDevice, computePipelineLayout, nullptr); + vkDestroyDescriptorSetLayout(logicalDevice, cameraDescriptorSetLayout, nullptr); vkDestroyDescriptorSetLayout(logicalDevice, modelDescriptorSetLayout, nullptr); vkDestroyDescriptorSetLayout(logicalDevice, timeDescriptorSetLayout, nullptr); + vkDestroyDescriptorSetLayout(logicalDevice, computeDescriptorSetLayout, nullptr); vkDestroyDescriptorPool(logicalDevice, descriptorPool, nullptr); @@ -1064,4 +1204,4 @@ Renderer::~Renderer() { DestroyFrameResources(); vkDestroyCommandPool(logicalDevice, computeCommandPool, nullptr); vkDestroyCommandPool(logicalDevice, graphicsCommandPool, nullptr); -} +} \ No newline at end of file diff --git a/src/Renderer.h b/src/Renderer.h index 95e025f..8088667 100644 --- a/src/Renderer.h +++ b/src/Renderer.h @@ -79,4 +79,10 @@ class Renderer { std::vector<VkCommandBuffer> commandBuffers; VkCommandBuffer computeCommandBuffer; + + // Added: + VkDescriptorSetLayout computeDescriptorSetLayout; + + std::vector<VkDescriptorSet> grassDescriptorSets; + std::vector<VkDescriptorSet> computeDescriptorSets; }; diff --git a/src/shaders/compute.comp b/src/shaders/compute.comp index 0fd0224..d4ec568 100644 --- a/src/shaders/compute.comp +++ b/src/shaders/compute.comp @@ -1,6 +1,6 @@ +// Grass forces and culling based on Responsive Real-Time Grass Rendering for General 3D Scenes Paper #version 450 #extension GL_ARB_separate_shader_objects : enable - #define WORKGROUP_SIZE 32 layout(local_size_x = WORKGROUP_SIZE, local_size_y = 1, local_size_z = 1) in; @@ -21,36 +21,160 @@ struct Blade { vec4 up; }; -// TODO: Add bindings to: +// Add bindings to: // 1. 
Store the input blades +layout(set = 2, binding = 0) buffer InputBlades { + Blade inputBlades[]; +}; + // 2. Write out the culled blades -// 3. Write the total number of blades remaining -// The project is using vkCmdDrawIndirect to use a buffer as the arguments for a draw call -// This is sort of an advanced feature so we've showed you what this buffer should look like -// -// layout(set = ???, binding = ???) buffer NumBlades { -// uint vertexCount; // Write the number of blades remaining here -// uint instanceCount; // = 1 -// uint firstVertex; // = 0 -// uint firstInstance; // = 0 -// } numBlades; +layout(set = 2, binding = 1) buffer CulledBlades { + Blade culledBlades[]; +}; + +// 3. Write the total number of blades remaining -bool inBounds(float value, float bounds) { - return (value >= -bounds) && (value <= bounds); -} +layout(set = 2, binding = 2) buffer IndirectDrawBuffer { + uint vertexCount; + uint instanceCount; + uint firstVertex; + uint firstInstance; +} drawParams; void main() { - // Reset the number of blades to 0 - if (gl_GlobalInvocationID.x == 0) { - // numBlades.vertexCount = 0; - } - barrier(); // Wait till all threads reach this point - - // TODO: Apply forces on every blade and update the vertices in the buffer - - // TODO: Cull blades that are too far away or not in the camera frustum and write them - // to the culled blades buffer - // Note: to do this, you will need to use an atomic operation to read and update numBlades.vertexCount - // You want to write the visible blades to the buffer without write conflicts between threads -} + uint bladeIndex = gl_GlobalInvocationID.x; + + if (bladeIndex == 0) { + drawParams.vertexCount = 0; + } + barrier(); + + if (bladeIndex < uint(inputBlades.length())) { + Blade currentBlade = inputBlades[bladeIndex]; + + vec3 v0 = currentBlade.v0.xyz; + vec3 v1 = currentBlade.v1.xyz; + vec3 v2 = currentBlade.v2.xyz; + vec3 up = currentBlade.up.xyz; + + float bladeHeight = currentBlade.v1.w; + float bladeWidth = currentBlade.v2.w; + 
float stiffness = currentBlade.up.w; + float bladeDirection = currentBlade.v0.w; + + // Calculate gravity force + vec4 D = vec4(0.0, -1.0, 0.0, 9.8); + + // Environmental gravity + vec3 gE = normalize(D.xyz) * D.w; + + // Front direction of the blade + vec3 orientationVec = vec3(cos(bladeDirection), 0.0, -sin(bladeDirection)); + vec3 frontDir = normalize(cross(up, orientationVec)); + + // Front gravity contribution + vec3 gF = 0.25f * length(gE) * frontDir; + + // Total gravity + vec3 g = gE + gF; + + // Recovery force + vec3 iv2 = v0 + bladeHeight * up; + vec3 r = (iv2 - v2) * stiffness; + + // Wind force + vec3 windDir = normalize(vec3( + sin(totalTime * 0.5 + v0.x * 0.1), + 0.0, + cos(totalTime * 0.3 + v0.z * 0.1) + )); + + // Wind alignment + float windAlignment = dot(windDir, frontDir); + windAlignment = windAlignment * windAlignment; + + // Height ratio + float heightRatio = length(v2 - v0) / bladeHeight; + heightRatio = clamp(heightRatio, 0.0, 1.0); + + // Wind force magnitude + float windStrength = 8.0f; + vec3 w = windDir * windAlignment * heightRatio * windStrength; + + // Total force on v2 + vec3 totalForce = g + r + w; + + v2 += totalForce * deltaTime * 5.0; + + // From 5.2 of Responsive Real-Time Grass Rendering for General 3D Scenes + v2 = v2 - up * min(dot(up, v2 - v0), 0.0f); + + float l_proj = length(v2 - v0 - up * (dot((v2 - v0), up))); + + v1 = v0 + bladeHeight * up * max(1.f - l_proj / bladeHeight, 0.05f * max(l_proj / bladeHeight, 1.0f)); + v2 -= up * min(dot(up, v2 - v0), 0.0f); + + float L0 = distance(v0, v2); + float L1 = distance(v0, v1) + distance(v1, v2); + float L = 0.5f * (L0 + L1); + float r2 = bladeHeight / L; + v1 = v0 + r2 * (v1 - v0); + v2 = v1 + r2 * (v2 - v1); + + bool visible = true; + + // Toggle these to enable/disable culling + const bool ENABLE_ORIENTATION_CULLING = true; + const bool ENABLE_FRUSTUM_CULLING = true; + const bool ENABLE_DISTANCE_CULLING = true; + + // Culling: start with visible = true then test each culling type. 
If it should be culled, set visible to false, skip rest and don't render this blade. + + // Orientation culling + vec3 cameraPos = vec3(inverse(camera.view)[3]); + vec3 viewDir = vec3(camera.view[0][2], camera.view[1][2], camera.view[2][2]); + if (ENABLE_ORIENTATION_CULLING && abs(dot(viewDir, frontDir)) > 0.9) { + visible = false; + } + + // Frustum culling + if (visible && ENABLE_FRUSTUM_CULLING) { + vec4 v0Clip = camera.proj * (camera.view * vec4(v0, 1.0)); + vec4 v2Clip = camera.proj * (camera.view * vec4(v2, 1.0)); + vec3 m = 0.25 * v0 + 0.5 * v1 + 0.25 * v2; + vec4 mClip = camera.proj * (camera.view * vec4(m, 1.0)); + + float tolerance = 1.5; + + bool v0InFrustum = abs(v0Clip.x) <= v0Clip.w * tolerance && abs(v0Clip.y) <= v0Clip.w * tolerance && v0Clip.z >= -v0Clip.w && v0Clip.z <= v0Clip.w; + + bool v2InFrustum = abs(v2Clip.x) <= v2Clip.w * tolerance && abs(v2Clip.y) <= v2Clip.w * tolerance && v2Clip.z >= -v2Clip.w && v2Clip.z <= v2Clip.w; + + bool mInFrustum = abs(mClip.x) <= mClip.w * tolerance && abs(mClip.y) <= mClip.w * tolerance && mClip.z >= -mClip.w && mClip.z <= mClip.w; + + if (!v0InFrustum && !v2InFrustum && !mInFrustum) { + visible = false; + } + } + + //Distance Culling + if (visible && ENABLE_DISTANCE_CULLING) { + vec3 camPos = inverse(camera.view)[3].xyz; + float distance_projection = length(v0 - camPos - up * (dot(v0 - camPos, up))); + + if (distance_projection > 25.0) { + visible = false; + } + } + + if (visible) { + Blade updatedBlade = currentBlade; + updatedBlade.v1 = vec4(v1, bladeHeight); + updatedBlade.v2 = vec4(v2, bladeWidth); + + uint writeIndex = atomicAdd(drawParams.vertexCount, 1); + culledBlades[writeIndex] = updatedBlade; + } + } +} \ No newline at end of file diff --git a/src/shaders/grass.frag b/src/shaders/grass.frag index c7df157..79a75a3 100644 --- a/src/shaders/grass.frag +++ b/src/shaders/grass.frag @@ -6,12 +6,28 @@ layout(set = 0, binding = 0) uniform CameraBufferObject { mat4 proj; } camera; -// TODO: Declare fragment 
shader inputs +// Input from tessellation evaluation shader +layout(location = 0) in vec3 fsNormal; +layout(location = 1) in float fsHeightRatio; -layout(location = 0) out vec4 outColor; +layout(location = 0) out vec4 outFragColor; void main() { - // TODO: Compute fragment color - - outColor = vec4(1.0); -} + // Define grass color palette + vec4 grassDark = vec4(0.0, 0.1, 0.0, 1.0); // Dark green at base + vec4 grassMid = vec4(0.0, 0.5, 0.0, 1.0); // Mid green in middle + vec4 grassLight = vec4(0.5, 0.8, 0.5, 1.0); // Light green at tip + + // Calculate lighting based on surface normal + vec3 upVector = vec3(0.0, 1.0, 0.0); + float lightIntensity = abs(dot(upVector, fsNormal)); + + // Blend between dark and light based on lighting + vec4 litColor = mix(grassDark, grassLight, lightIntensity); + + // Blend between mid and light based on height along blade + vec4 heightColor = mix(grassMid, grassLight, fsHeightRatio); + + // Combine lighting and height for final color + outFragColor = mix(litColor, heightColor, 0.5); +} \ No newline at end of file diff --git a/src/shaders/grass.tesc b/src/shaders/grass.tesc index f9ffd07..2f83c3b 100644 --- a/src/shaders/grass.tesc +++ b/src/shaders/grass.tesc @@ -1,26 +1,80 @@ #version 450 #extension GL_ARB_separate_shader_objects : enable -layout(vertices = 1) out; +layout(vertices = 4) out; layout(set = 0, binding = 0) uniform CameraBufferObject { mat4 view; mat4 proj; } camera; +// Receive blade data from vertex shader +layout(location = 0) in vec3 tcBasePos[]; +layout(location = 1) in vec3 tcBezierPt[]; +layout(location = 2) in vec3 tcPhysicsPt[]; +layout(location = 3) in vec3 tcUpDir[]; +layout(location = 4) in float tcBladeWidth[]; + // TODO: Declare tessellation control shader inputs and outputs +// Pass blade data to tessellation evaluation shader +layout(location = 0) out vec3 teBasePos[]; +layout(location = 1) out vec3 teBezierPt[]; +layout(location = 2) out vec3 tePhysicsPt[]; +layout(location = 3) out vec3 teUpDir[]; 
+layout(location = 4) out float teBladeWidth[]; + +in gl_PerVertex { + vec4 gl_Position; +} gl_in[]; + +out gl_PerVertex { + vec4 gl_Position; +} gl_out[]; void main() { // Don't move the origin location of the patch gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position; - + // TODO: Write any shader outputs - - // TODO: Set level of tesselation - // gl_TessLevelInner[0] = ??? - // gl_TessLevelInner[1] = ??? - // gl_TessLevelOuter[0] = ??? - // gl_TessLevelOuter[1] = ??? - // gl_TessLevelOuter[2] = ??? - // gl_TessLevelOuter[3] = ??? -} + // Pass blade geometry data to evaluation shader + teBasePos[gl_InvocationID] = tcBasePos[gl_InvocationID]; + teBezierPt[gl_InvocationID] = tcBezierPt[gl_InvocationID]; + tePhysicsPt[gl_InvocationID] = tcPhysicsPt[gl_InvocationID]; + teUpDir[gl_InvocationID] = tcUpDir[gl_InvocationID]; + teBladeWidth[gl_InvocationID] = tcBladeWidth[gl_InvocationID]; + + // TODO: Set level of tessellation + // Calculate tessellation levels based on distance to camera + float distToCamera = length(tcBasePos[gl_InvocationID] - inverse(camera.view)[3].xyz); + float tessLevel = 4.0; + + if (distToCamera < 2.0) + tessLevel = 32.0; + else if (distToCamera < 4.0) + tessLevel = 24.0; + else if (distToCamera < 6.0) + tessLevel = 20.0; + else if (distToCamera < 8.0) + tessLevel = 16.0; + else if (distToCamera < 12.0) + tessLevel = 14.0; + else if (distToCamera < 16.0) + tessLevel = 12.0; + else if (distToCamera < 20.0) + tessLevel = 10.0; + else if (distToCamera < 24.0) + tessLevel = 8.0; + else if (distToCamera < 28.0) + tessLevel = 6.0; + else if (distToCamera < 32.0) + tessLevel = 4.0; + else + tessLevel = 2.0; + + gl_TessLevelInner[0] = tessLevel; + gl_TessLevelInner[1] = tessLevel; + gl_TessLevelOuter[0] = tessLevel; + gl_TessLevelOuter[1] = tessLevel; + gl_TessLevelOuter[2] = tessLevel; + gl_TessLevelOuter[3] = tessLevel; +} \ No newline at end of file diff --git a/src/shaders/grass.tese b/src/shaders/grass.tese index 
751fff6..a6bc3e0 100644 --- a/src/shaders/grass.tese +++ b/src/shaders/grass.tese @@ -1,18 +1,44 @@ #version 450 #extension GL_ARB_separate_shader_objects : enable - layout(quads, equal_spacing, ccw) in; - layout(set = 0, binding = 0) uniform CameraBufferObject { mat4 view; mat4 proj; } camera; +layout(location = 0) in vec3 teBasePos[]; +layout(location = 1) in vec3 teBezierPt[]; +layout(location = 2) in vec3 tePhysicsPt[]; +layout(location = 3) in vec3 teUpDir[]; +layout(location = 4) in float teBladeWidth[]; + // TODO: Declare tessellation evaluation shader inputs and outputs +layout(location = 0) out vec3 fsNormal; +layout(location = 1) out float fsHeightRatio; void main() { - float u = gl_TessCoord.x; + float u = gl_TessCoord.x * 1.5; float v = gl_TessCoord.y; - - // TODO: Use u and v to parameterize along the grass blade and output positions for each vertex of the grass blade -} + vec3 basePos = teBasePos[0]; + vec3 bezierPt = teBezierPt[0]; + vec3 physicsPt = tePhysicsPt[0]; + vec3 upDir = teUpDir[0]; + float bladeWidth = teBladeWidth[0]; + + // TODO: Use u and v to parameterize along the grass blade and output positions for each vertex of the grass blade + vec3 posAlongHeight = mix(basePos, bezierPt, v); + vec3 posAlongCurve = mix(posAlongHeight, mix(bezierPt, physicsPt, v), v); + + vec3 edgeLeft = posAlongCurve - bladeWidth * upDir; + vec3 edgeRight = posAlongCurve + bladeWidth * upDir; + + float edgeBlend = u + 0.5 * v - u * v; + vec3 finalPosition = mix(edgeLeft, edgeRight, edgeBlend); + + vec3 tangent = normalize(mix(bezierPt, physicsPt, v) - posAlongHeight); + fsNormal = normalize(cross(tangent, upDir)); + + fsHeightRatio = v; + + gl_Position = camera.proj * camera.view * vec4(finalPosition, 1.0); +} \ No newline at end of file diff --git a/src/shaders/grass.vert b/src/shaders/grass.vert index db9dfe9..4eeddd0 100644 --- a/src/shaders/grass.vert +++ b/src/shaders/grass.vert @@ -1,4 +1,3 @@ - #version 450 #extension GL_ARB_separate_shader_objects : 
enable @@ -7,11 +6,37 @@ layout(set = 1, binding = 0) uniform ModelBufferObject { }; // TODO: Declare vertex shader inputs and outputs +// Blade data from vertex buffer +layout(location = 0) in vec4 basePosition; // v0: position + direction +layout(location = 1) in vec4 bezierControl; // v1: bezier point + height +layout(location = 2) in vec4 physicsGuide; // v2: physics guide + width +layout(location = 3) in vec4 bladeUpVector; // up: up vector + stiffness + +// Pass blade data to tessellation control shader +layout(location = 0) out vec3 tcBasePos; +layout(location = 1) out vec3 tcBezierPt; +layout(location = 2) out vec3 tcPhysicsPt; +layout(location = 3) out vec3 tcUpDir; +layout(location = 4) out float tcBladeWidth; out gl_PerVertex { vec4 gl_Position; }; void main() { - // TODO: Write gl_Position and any other shader outputs -} + // Transform base position to world space + gl_Position = model * basePosition; + + // Pass blade geometry to next stage + tcBasePos = vec3(model * vec4(basePosition.xyz, 1.0)); + tcBezierPt = vec3(model * vec4(bezierControl.xyz, 1.0)); + tcPhysicsPt = vec3(model * vec4(physicsGuide.xyz, 1.0)); + + // Extract width and build a horizontal direction from the orientation angle in basePosition.w (the stiffness scale cancels under normalize); grass.tese offsets the blade edges along this vector + tcBladeWidth = physicsGuide.w; + tcUpDir = normalize(vec3( + bladeUpVector.w * cos(basePosition.w), + 0.0, + bladeUpVector.w * sin(basePosition.w) + )); +} \ No newline at end of file
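As a CPU-side cross-check of the orientation and distance culling tests in `src/shaders/compute.comp`, the following is a minimal C++ sketch rather than code from this repository. It assumes the same thresholds the shader uses (0.9 for the dot product, 25.0 units for the distance) and unit-length direction vectors. The `Vec3` struct and the helpers `orientationCulled` and `distanceCulled` are hypothetical names introduced only for illustration. Frustum culling is left out because it needs the camera matrices.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal vector type; the real project would use its own math library.
struct Vec3 {
    float x, y, z;
};

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 scale(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static float length(Vec3 a) { return std::sqrt(dot(a, a)); }

// Orientation test: cull when the blade's front direction is almost parallel
// to the view direction. Both inputs are assumed to be unit vectors.
bool orientationCulled(Vec3 viewDir, Vec3 frontDir, float threshold = 0.9f) {
    return std::fabs(dot(viewDir, frontDir)) > threshold;
}

// Distance test: remove the component along `up` from the camera-to-root
// vector, then cull when the remaining (horizontal) distance is too large.
bool distanceCulled(Vec3 v0, Vec3 camPos, Vec3 up, float maxDistance = 25.0f) {
    Vec3 toBlade = sub(v0, camPos);
    Vec3 horizontal = sub(toBlade, scale(up, dot(toBlade, up)));
    return length(horizontal) > maxDistance;
}
```

Because the distance test projects out the `up` component, a camera high above a nearby blade does not cull it; only the horizontal offset counts, which matches the projected distance described in the README section of this diff.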