Open
These are instructions for cross-compiling LLDB on FreeBSD, based on @mgorny's [blog post](https://web.archive.org/web/20250827001729/https://www.moritz.systems/blog/freebsd-remote-process-plugin-on-non-x86-architectures/). Tested by building an arm64 binary on an amd64 machine and an amd64 binary on an arm64 machine. --------- Signed-off-by: Minsoo Choo <minsoochoo0122@proton.me> Co-authored-by: Michał Górny <mgorny@quansight.com>
) It erroneously merged the closing brace even when breaking after the opening brace. Fixes llvm#187444
…vm#187635) This argument should be used by ControllerAccess implementations to pass bootstrap information (process triple, page size, initial symbols and values) to the controller.
…ling code (llvm#184014) This patch introduces a new HandleModuleName function to avoid duplicating code in the module name handling stage. --------- Signed-off-by: yronglin <yronglin777@gmail.com> Signed-off-by: Wang, Yihan <yronglin777@gmail.com>
… of CIRDialectLLVMIRTranslationInterface (llvm#186073) Add the amendOperation override to handle CIR dialect attributes during MLIR-to-LLVM IR translation. This dispatches to amendModule for ModuleOp, enabling module metadata. This PR also adds support to emit AMDGPU-specific module flags amdhsa_code_object_version and amdgpu_printf_kind to match OGCG behavior. In CIRGenModule, the flags are stored as CIR module attributes: cir.amdhsa_code_object_version (integer) cir.amdgpu_printf_kind (string: "hostcall" or "buffered") During lowering to LLVM IR (in LowerToLLVMIR.cpp), these attributes are converted to LLVM module flags. Upstreaming basic changes from clangIR PRs: llvm/clangir@61e9ebd llvm/clangir#768 llvm/clangir#773 llvm/clangir#2100
…lvm#187627) Part 2/4: Implement HALO for coroutines that flow off final suspend. Parent PR approved in llvm#185336, with no change since then. Since `coro.id` is unavailable in resumers, elide `coro.free` based on the frame instead of `coro.id`.
The fallback non-canonicalize path didn't work. Use a more straightforward implementation. Eventually this should use the pattern from llvm#172998
This eliminates duplicated epilog code. The unused half optimizes out just fine after inlining.
I was very puzzled the other day when it showed that VF 8 had a cost of X and VF 16 had a cost of X/2, yet it still chose VF 8. This PR adds some extra debug output to explain why this happens.
These were originally ported from rocm device libs in bc81ebe. Merge in more recent changes.
…llvm#187445) The previous value of 0 was allowing loads to move past the mops operations where it is not valid. Use a LocationSize::afterPointer() size instead. The GISel lowering currently loses the MMO, which is fine as it should be conservatively treated as a load/store to any location.
Follow the ordinary gentype conventions for the log implementation, instead of using a plain header. This doesn't quite yet enable vectorization, due to how the table is currently indexed. This should make it easier for targets to selectively overload the function for a subset of types.
…87538) This is pretty verbose and ugly. We're pulling the base implementation in for the double cases, and scalarizing it. Also fully defining the half and float cases to directly use the intrinsic, for all vector types. It would be much more convenient if we had linker based overrides for the generic implementations, rather than per source file.
…llvm#187570) This is to help with llvm#185382 and to make sure that I don't miss any PRs.
…#187462) This will allow us to more conveniently use llvm::formatv in the codebase.
This patch adds a Clang-compatible -mtune option to llc to enable decoupled ISA and microarchitecture targeting, which is especially important for backend development. For example, it makes it easy to test the effect of a subtarget feature or scheduling model on codegen across a variety of workloads in the IR corpus benchmark: https://github.com/dtcxzyw/llvm-codegen-benchmark. The implementation adds an isolated generic codegen flag to establish a base for wider usage; the plan is to add it to `opt` as well in a follow-up patch. `llc` then consumes the flag and sets `tune-cpu` attributes on functions, which are in turn consumed by the backend.
…e archives (llvm#187113) Add checksum verification for libxml2, zlib, and zstd source archives via `cmake -E *sum` and `cmake -E compare_files` commands. This also adds the following minor changes: * Factor out libxml2 version into variable. * Check `tar` return code.
Bitcast the large scalar integer to a vXi64 vector, reverse the elements, and then perform a per-element vXi64 bitreverse. If we have SSSE3 or later, BITREVERSE expansion using PSHUFB is always more efficient than performing it as a scalar sequence (no need for a mayFoldIntoVector check). Fixes llvm#187353
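The decomposition described above can be illustrated in scalar C++. This is only a sketch of the arithmetic identity, not the X86 lowering itself: the names `bitreverse64`, `U128` and `bitreverse128` are hypothetical, and a 128-bit integer is modeled as two 64-bit limbs with limb 0 least significant.

```cpp
#include <array>
#include <cstdint>

// Reverse the bits of one 64-bit element by swapping progressively larger groups.
inline uint64_t bitreverse64(uint64_t V) {
  V = ((V >> 1) & 0x5555555555555555ULL) | ((V & 0x5555555555555555ULL) << 1);
  V = ((V >> 2) & 0x3333333333333333ULL) | ((V & 0x3333333333333333ULL) << 2);
  V = ((V >> 4) & 0x0F0F0F0F0F0F0F0FULL) | ((V & 0x0F0F0F0F0F0F0F0FULL) << 4);
  V = ((V >> 8) & 0x00FF00FF00FF00FFULL) | ((V & 0x00FF00FF00FF00FFULL) << 8);
  V = ((V >> 16) & 0x0000FFFF0000FFFFULL) | ((V & 0x0000FFFF0000FFFFULL) << 16);
  return (V >> 32) | (V << 32);
}

// Hypothetical model of a 128-bit integer: Limbs[0] holds the low 64 bits.
using U128 = std::array<uint64_t, 2>;

// Reversing the whole integer equals reversing the element order and then
// reversing the bits inside each 64-bit element.
inline U128 bitreverse128(const U128 &X) {
  return {bitreverse64(X[1]), bitreverse64(X[0])};
}
```

Bit 0 of the input ends up as bit 127 of the result, and applying `bitreverse128` twice returns the original value.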
This pass does not actually use TargetMachine/TargetLoweringInfo.
llvm#187644) …tail storage" (llvm#187410) This reverts commit bf1db77. Avoid using an `InterpFrame` member after calling its destructor this time. I hope that was the only problem.
This updates `matchExtendedReductionOperand` so the simple case of `UpdateR(PrevValue, ext(...))` is matched first as an early exit. The binop matching is then flattened to remove the extra layer of the `MatchExtends` lambda.
"Effective" is the wrong word: Both overloads are effective; they do what they're supposed to do. But the character overload does less work.
…sics (llvm#187513) Previously, GlobalISel was failing to select these intrinsics when given scalar operands, as RegBankSelect would place these on GPR banks. Fixing this enables GlobalISel to lower correctly, as in Instruction Selection the intrinsic matches the SIMD patterns in AArch64InstrInfo.td.
Async operations transfer data between global memory and LDS. Their progress is tracked by the ASYNC_CNT counter on GFX1250 and later architectures. This change introduces the representation of that counter in SIInsertWaitCnts. For now, the programmer must manually insert s_wait_asyncnt instructions. Later changes will add compiler assistance for generating the waits by including this counter in the asyncmark instructions. Assisted-by: Claude Sonnet 4.5 This is part of a stack: - llvm#185813 - llvm#185810
…ns (llvm#187241) The SPIR-V backend previously only supported function annotations in llvm.global.annotations and crashed with a fatal error when encountering global variable entries.
…lvm#187483) Most loop transformations, like unrolling and vectorization, expect the latch branch to be countable. Allow rotation if it turns the latch from uncountable to countable. This uses SCEV to check for countable exits if CheckExitCount is set. Currently it is not set for the LPM1 run (where SCEV is not used by other passes), only in LPM. With that, the compile-time impact is mostly neutral: https://llvm-compile-time-tracker.com/compare.php?from=eba342d0ba930a404a026c80aada51c43974f0db&to=2e676337b45fae63ce9498116d8e6e43772363c5&stat=instructions:u ClamAV is consistently slower (~+0.15%) and 7zip is faster in most cases (~-0.13%). Across a large test set based on C/C++ workloads, this rotates ~0.8% more loops, with ~2.68M rotated loops. For the test set, ~2.7% more loops are runtime-unrolled and +6.36% more early-exit loops are vectorized on ARM64 macOS. This fixes a regression where std::ranges::find_last loops stopped being runtime-unrolled after llvm@5f648c3, which changed the loop structure so that we stopped rotating. https://clang.godbolt.org/z/6baeE1av6 Based on llvm#162654. Co-authored-by: Marek Sedláček <mr.mareksedlacek@gmail.com> PR: llvm#187483
…nd whitespace handling (llvm#186950) The `check_alphabetical_order.py` script previously only scanned the first line of each bullet point in `ReleaseNotes.rst`, causing sorting failures when a `:doc:` tag was split across multiple lines. Also, when sorting the last entry of a section, the script inserted unnecessary whitespace. This PR fixes these two problems.
…lvm#187020) The `CallEvent` has data members that store the `LocationContext` and the `CFGElementRef` (i.e. `CFGBlock` + index of the statement within that block), but the method `getReturnValueUnderConstruction` ignored these and used the currently analyzed `LocationContext` and `CFGBlock` instead. This was logically incorrect and would have caused problems if the `CallEvent` was used later, when the "currently analyzed" things are different. However, the lit tests do pass even if I assert that the currently analyzed `LocationContext` and `CFGBlock` are the same as the ones saved in the `CallEvent`, so I'm pretty sure that there was no actual problem caused by this bad logic and this commit won't cause functional changes. I also evaluated this change on a set of open source projects (postgres, tinyxml2, libwebm, xerces, bitcoin, protobuf, qtbase, contour, openrct2) and validated that it doesn't change the results of the analysis.
…lock sub (llvm#184715) Resolves: llvm#174933 The issue describes a case where fetch_sub(n) is properly optimized but fetch_add(neg(n)) is not optimized to the same code. Although the issue is tagged for x86, I assumed this would be best handled outside of the backends, so I put this in InstCombine.
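The two source-level forms that the issue compares can be written with `std::atomic` as below. This is a minimal sketch of the semantic equivalence only; the helper names `viaFetchSub` and `viaFetchAddNeg` are hypothetical, and the sketch says nothing about which instructions a given compiler version emits for either form.

```cpp
#include <atomic>
#include <cstdint>

// Subtract N atomically and return the previous value.
inline uint32_t viaFetchSub(std::atomic<uint32_t> &Counter, uint32_t N) {
  return Counter.fetch_sub(N);
}

// Same effect expressed as an add of the negated operand. Unsigned negation
// wraps modulo 2^32, so Counter + (0 - N) equals Counter - N.
inline uint32_t viaFetchAddNeg(std::atomic<uint32_t> &Counter, uint32_t N) {
  return Counter.fetch_add(0u - N);
}
```

Because both functions leave the counter in the same state and return the same previous value, an optimizer is free to canonicalize one form into the other.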
Just use an empty list always.
Inline the definition of a variable into an assertion given it has no other users and no side effects.
Lay the ground for C++26 `constexpr` math functions: - Introduce `LIBC_ENABLE_CONSTEXPR` macro switch to specify the desire of `constexpr`-only code route. - Introduce `LIBC_HAS_CONSTANT_EVALUATION` to indicate that we are using `constexpr`-only code in all dependent functions. - Introduce `LIBC_CONSTEXPR` macro qualifier to aid in altering the signature of non-`constexpr` functions. Note that non-`constexpr` qualified functions are caused by the exploitation of non-`constexpr` compatible utils, resulting in non-qualified dependent function, but it can be modified to be qualified using other code routes. If the function is `constexpr` compatible, then it's prohibited to use `LIBC_CONSTEXPR` as a function qualifier. We only qualify it with `constexpr` as usual. `LIBC_CONSTEXPR` may or may not evaluate to `constexpr` depending on the environment configurations, thus it's only used to modify the function signature in constant evaluation context and remove the qualifier if it's not desired (depending on provided configurations). Possible side effects: - Current qualified routes may or may not produce the desired ULP, this is implementation dependent (function by function basis) and needs further testing of the chosen code route. - The shared tests in the current configuration can still compile with unsupported compiler. I didn't want to raise compilation error with unsupported compilers now, but we need to push compiler support with newer versions for this one to work as intended.
…/invalid-vop3-source-modifiers.mir (llvm#187888)
Use (nearly) the same code to align case statements and expression, as the other alignments do. That way we also fix two things: - Keep the ColumnLimit intact, without duplicating the calculation. - Align all the case colons, even for empty cases.
The new code introduced for `-verify-directives` in PR llvm#179835 enforces that the order of diagnostics matches the order of the directives. However, before checking this, it sorts the directives by SourceLocation. Perhaps non-obviously, all directives which appear inside a single comment are given the same SourceLocation, pointing to the beginning of the comment. While these are added in the order they appear in the comment, the non-stable std::sort may non-deterministically misorder them. Switching to stable_sort ensures the correct order is verified. This was observed as a random test failure on the checks in clang/test/CXX/drs/cwg25xx.cpp lines 250 and 264, in some builds of Clang. Note that those lines end in backslashes, and thus, despite appearances, the directives on the following lines are also within the same single comment.
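The difference between the two sorts can be sketched outside Clang. In the example below, `Directive`, `sortDirectives` and `sortedTexts` are hypothetical stand-ins, and `Loc` models the shared SourceLocation of a comment; the real data structures in the verifier are not shown here.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a verify directive.
struct Directive {
  unsigned Loc;     // models the SourceLocation of the enclosing comment
  std::string Text; // models the expected diagnostic
};

// Sort by location only. std::stable_sort guarantees that elements comparing
// equal keep their insertion order; std::sort makes no such guarantee.
inline void sortDirectives(std::vector<Directive> &Ds) {
  std::stable_sort(Ds.begin(), Ds.end(),
                   [](const Directive &A, const Directive &B) {
                     return A.Loc < B.Loc;
                   });
}

// Three directives share location 10, as if they sat in one comment.
inline std::vector<std::string> sortedTexts() {
  std::vector<Directive> Ds = {
      {20, "later"}, {10, "first"}, {10, "second"}, {10, "third"}};
  sortDirectives(Ds);
  std::vector<std::string> Out;
  for (const Directive &D : Ds)
    Out.push_back(D.Text);
  return Out;
}
```

With a stable sort, the three directives at location 10 always come out as first, second, third; with `std::sort` any permutation of them would be a conforming result.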
…7640) Session::tryCreateService will try to create an instance of ServiceT by forwarding the given arguments to the ServiceT::Create method, which must return an Expected<std::unique_ptr<ServiceT>>. This enables one-line construction and registration of Services with fallible constructors (which are expected to be common).
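The forwarding pattern described above can be sketched without LLVM headers. Everything here is an assumption made for illustration: `Fallible` is a stand-in for `Expected<std::unique_ptr<ServiceT>>` built on `std::optional`, and `MiniSession`, `ServiceBase` and `NamedService` are hypothetical types. The return type and registration details of the real `Session::tryCreateService` are not stated in the commit message and may differ.

```cpp
#include <cstddef>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Stand-in for Expected<std::unique_ptr<T>>: an empty optional means failure.
template <typename T> using Fallible = std::optional<std::unique_ptr<T>>;

struct ServiceBase {
  virtual ~ServiceBase() = default;
};

// Hypothetical session that owns its services.
class MiniSession {
public:
  // Forward the arguments to ServiceT::Create and register the result on
  // success. Returns a non-owning pointer, or nullptr if creation failed.
  template <typename ServiceT, typename... ArgTs>
  ServiceT *tryCreateService(ArgTs &&...Args) {
    Fallible<ServiceT> S = ServiceT::Create(std::forward<ArgTs>(Args)...);
    if (!S)
      return nullptr;
    ServiceT *Raw = S->get();
    Services.push_back(std::move(*S));
    return Raw;
  }

  std::size_t numServices() const { return Services.size(); }

private:
  std::vector<std::unique_ptr<ServiceBase>> Services;
};

// Hypothetical service whose construction fails for an empty name.
struct NamedService : ServiceBase {
  std::string Name;

  static Fallible<NamedService> Create(std::string Name) {
    if (Name.empty())
      return std::nullopt;
    auto S = std::make_unique<NamedService>();
    S->Name = std::move(Name);
    return Fallible<NamedService>(std::move(S));
  }
};
```

A caller can then write `Session.tryCreateService<NamedService>("logger")` on one line instead of calling `Create`, checking the result, and registering the service separately.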
…lvm#187826) Use FieldDecl::getFieldIndex() instead of manually iterating over fields.
Adds optional attribute to allow specialization into category Linalg ops. The default behavior of the transform op remains unchanged.
We can use the negate-if-carry trick for abdu, and it works on all types that are legal for sbb.
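One way to read the trick, as a scalar sketch: compute the wrapping difference, materialize the borrow as an all-ones or all-zero mask (the role played by `sbb` of a register with itself), and use the mask to conditionally negate. The function name `abdu` and the 32-bit width are assumptions for illustration, and the exact instruction sequence the backend emits may differ.

```cpp
#include <cstdint>

// Unsigned absolute difference |A - B| without a branch.
inline uint32_t abdu(uint32_t A, uint32_t B) {
  uint32_t Diff = A - B;                // wraps modulo 2^32 when A < B
  uint32_t Mask = 0u - uint32_t(A < B); // all ones if the subtract borrowed
  return (Diff ^ Mask) - Mask;          // negates Diff only when Mask is all ones
}
```

When there is no borrow, the mask is zero and the difference passes through unchanged; when there is a borrow, `(Diff ^ Mask) - Mask` equals `~Diff + 1`, the two's complement negation of `Diff`.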
…87508) Fixes llvm#187012 which is a false positive on clang-tidy end.
…i/gcs=always (llvm#186343) Previously, the implicit warnings from force-bti (or gcs=always) couldn't be silenced. The force-ibt/cet-report flags could also be handled the same way, but I haven't checked how they behave in GNU ld. There, the force-ibt flag only produces warnings if the IBT bit is missing, while cet-report warns if either IBT or SHSTK is missing, but force-ibt probably shouldn't implicitly start warning for missing SHSTK. This addresses a discrepancy with GNU ld that was noted in llvm#186173.
Since befaa35, the CI has consistently failed for the generic-no-wide-characters build, because in no-`wchar_t` modes the header for `__remove_cv_t` wasn't properly included. This PR adds the missing include of `<__type_traits/remove_cv.h>`. As a drive-by, `<__cstddef/size_t.h>` and `<__type_traits/is_constant_evaluated.h>`, which are included by `<cwchar>`, are now also included directly by `<string>` to avoid a potential regression, since we use `size_t` and `__libcpp_is_constant_evaluated()` in `<string>`.
PSDB Build Link: http://mlse-bdc-20dd129:8065/#/builders/11/builds/72
No description provided.