This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP][IRBuilder] `omp task` support
ClosedPublic

Authored by shraiysh on Dec 29 2019, 11:20 PM.

Details

Summary

This patch adds basic support for omp task to the OpenMPIRBuilder.

The outlined function after code extraction is called from a wrapper function with appropriate arguments. This wrapper function is passed to the runtime calls for task allocation.

This approach is different from the Clang approach - clang directly emits the runtime call to the outlined function. The outlining utility (OutlineInfo) simply outlines the code and generates a function call to the outlined function. After the function has been generated by the outlining utility, there is no easy way to alter the function arguments without meddling with the outlining itself. Hence the wrapper function approach is taken.

Diff Detail

Event Timeline

jdoerfert created this revision.Dec 29 2019, 11:20 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptDec 29 2019, 11:20 PM
lebedev.ri added inline comments.
llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
141–142

This doesn't seem correctt.

Unit tests: unknown.

clang-tidy: unknown.

clang-format: unknown.

Build artifacts: diff.json, console-log.txt

openmp/runtime/src/kmp.h
2242 ↗(On Diff #235555)

Existing code, but the comments about number of bits here should probably be executable. E.g. give the compiler and the library a uint16_t field each, assert that sizeof kmp_tasking_flags == sizeof(uint32_t). Optionally four byte align the first short.

jdoerfert marked 2 inline comments as done.Dec 30 2019, 9:55 AM
jdoerfert added inline comments.
llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
141–142

It is not, thx, 62.

openmp/runtime/src/kmp.h
2242 ↗(On Diff #235555)

I'd like to make a bunch of changes like that now that I started to look into the library closer. In addition to the TODOs I added below (in another patch), I found seemingly unused functions, arguments, ... Versioning the runtime seems interesting to me now.

rogfer01 added inline comments.Jan 13 2020, 11:43 PM
openmp/runtime/src/kmp_tasking.cpp
1764 ↗(On Diff #235555)

Just double-checking here I understand what would happen here for C++ firstprivatized objects (sorry if I'm asking the obvious):

We would capture the copy onto an alloced memory and then we would copy that memory to the task struct?

So given something like

struct A {
  A(int x = 3);
  void bar();
};

void foo() {
  A a;
  #pragma omp task
  {
    a.bar();
  }
}

we'd do (in pseudo-C)

void foo() {
   struct task_env_0 { 
      char A_storage[sizeof(A)]; // properly aligned and all that
   } tenv_0;

   A::A(&tenv_0.A_storage, 3); // capture

   // This would happen (dynamically) inside __kmpc_task
   kmp_task_t* new_task = __kmp_omp_task_alloc( , sizeof(tenv_0), ... );
   memcpy(new_task->shareds, &tenv_0, sizeof(tenv_0));
   kmpc_omp_task(..., new_task).
   //
}

So we go from

  • create task
  • capture environment in the task context
  • queue task / run immediately for if0

to

  • capture environment in a local storage
  • allocate task + copy local environment to task environment if needed (for tasks that are not if0) + queue task / run immediately for if0

Did I get that right?

Thanks!

the task create and task issue step are
conceptually not separated anymore as it is

I don't think this can work reliably. Because not all C++ objects can be mem-copied.
E.g. an object can keep its own address or reference, and mem-copy will make it broken.
This could be fixed by generating (optional) thunk routine which would create all needed objects
in the library-allocated space, and similar routine which would destroy all the created objects.

the task create and task issue step are
conceptually not separated anymore as it is

I don't think this can work reliably. Because not all C++ objects can be mem-copied.
E.g. an object can keep its own address or reference, and mem-copy will make it broken.
This could be fixed by generating (optional) thunk routine which would create all needed objects
in the library-allocated space, and similar routine which would destroy all the created objects.

I agree. memcpy will only work for certain types (incl. all PoD types). We need to extend this later to pass the copy constructor function pointers and locations of the original object.
We should keep kmp_uint32 sizeof_shared_and_private_vars, void *shared_and_private_vars for the simple cases and add something like:

/// Copy wrapper takes the address of an object and the address the copy is going to be initialized in and returns the address right after the new object. 
typedef void*(*copy_wrapper)(void * /* src_addr */, void * /* trg_addr */);
 
__kmpc_task( ..., kmp_uint32 sizeof_copy_infos, copy_wrapper * copy_wrapper_list, void * copied_obj_list, ...)

and we then call the copy constructors like this:

void *local_addr = ...;
for (kmp_uint32 u = 0; u < sizeof_copy_infos; ++u)
  local_addr = copy_wrapper_list[u](copied_obj_list[u], local_addr);

@rogfer01 @AndreyChurbanov What do you think? (I can do this in this patch or later)

and we then call the copy constructors like this:

void *local_addr = ...;
for (kmp_uint32 u = 0; u < sizeof_copy_infos; ++u)
  local_addr = copy_wrapper_list[u](copied_obj_list[u], local_addr);

Just to confirm: that local_addr would be somehow linked to the task, I imagine it'd be initialized to something like task->shareds + offset_to_firstprivates, wouldn't it?

Also, perhaps you already considered if it makes sense to just have a copy function generated by the front-end rather than one of for each "firstprivatized variable that can't just be memcpy'ed"?

Also I'm curious why are you moving away (perhaps I did get this wrong!) from the current model of

  1. kmpc_omp_task_alloc
  2. capture environment
  3. queue the task kmpc_omp_taskor (if the task is if(0)) do an "immediate" execution but use the environment captured in 2

to something that looks like

  1. (partially?) capture the environment. I think I didn't understand what void *shared_and_private_vars will do here. Is this something that the front-end precaptured for us?
  2. copy the environment you obtained from shared_and_private_vars to a task-local storage and then queue the task or (if the task is if(0)) do an immediate execution using the environment you got from the argument to kmpc_task

I guess there is a benefit in the new approach. So far I read this as improving the if(0) case (but I may be missing something here).

Thanks!

and we then call the copy constructors like this:

void *local_addr = ...;
for (kmp_uint32 u = 0; u < sizeof_copy_infos; ++u)
  local_addr = copy_wrapper_list[u](copied_obj_list[u], local_addr);

Just to confirm: that local_addr would be somehow linked to the task, I imagine it'd be initialized to something like task->shareds + offset_to_firstprivates, wouldn't it?

Yes. It is some location in which the task local variables live.

Also, perhaps you already considered if it makes sense to just have a copy function generated by the front-end rather than one of for each "firstprivatized variable that can't just be memcpy'ed"?

Having a copy function per type allows us to reuse it, otherwise we have one copy function per static task location (at worst). Either works for me I think.

Also I'm curious why are you moving away (perhaps I did get this wrong!) from the current model of

  1. kmpc_omp_task_alloc
  2. capture environment
  3. queue the task kmpc_omp_taskor (if the task is if(0)) do an "immediate" execution but use the environment captured in 2

to something that looks like

  1. (partially?) capture the environment. I think I didn't understand what void *shared_and_private_vars will do here. Is this something that the front-end precaptured for us?

The idea is that we can allocate variables (for which we do not invoke a copy constructor) in a smart way so that we can easily copy them if we need to or not copy them at all if we don't. This interface is a first version though roughly designed like the tregion interface. Any suggestions are welcome!

  1. copy the environment you obtained from shared_and_private_vars to a task-local storage and then queue the task or (if the task is if(0)) do an immediate execution using the environment you got from the argument to kmpc_task

Optimally, you would only copy if you need to.

I guess there is a benefit in the new approach. So far I read this as improving the if(0) case (but I may be missing something here).

There are multiple reasons but the main one for me is for sure the ability to use callback metadata to link the values passed at the call site of kmpc_task with the values received at the outlined body function. This is really complicated in the old approach, because there is no direct link possible between the captured values and the local copies in the thread (the stores go to some really hard to describe location and the body function reference is only at some point later).
In the new approach you link shared_and_private_vars to the argument the task function receives. Now you encapsulate the call site in an additional, modifiable level of indirection (see D71505), and the Attributor will unpack the struct (see D68852), propagate constants, alias information, ... between the task invocation and the task function.

Having a copy function per type allows us to reuse it, otherwise we have one copy function per static task location (at worst). Either works for me I think.

I also would be OK with either option.
But note that using per-type functions will require more additions to the interface, like:

(..., num_objects, array_of_copy_wrappers, array_of_desctuctor_wrappers, array_of_obj_offsets).

Then the library can iterate over objects to copy-construct them, then iterate to destroy them after the task is complete. Without any possibility of inlining of any wrappers.

Per-task function only needs two additions - copy_wrapper and destructor_wrapper, all other details can live inside them, including possible inlining of constructors and destructors.

Having a copy function per type allows us to reuse it, otherwise we have one copy function per static task location (at worst). Either works for me I think.

I also would be OK with either option.
But note that using per-type functions will require more additions to the interface, like:

(..., num_objects, array_of_copy_wrappers, array_of_desctuctor_wrappers, array_of_obj_offsets).

Then the library can iterate over objects to copy-construct them, then iterate to destroy them after the task is complete. Without any possibility of inlining of any wrappers.

Per-task function only needs two additions - copy_wrapper and destructor_wrapper, all other details can live inside them, including possible inlining of constructors and destructors.

Agreed. Interestingly, the "per-type" solution should allow you to implement the "per-task" version on top. You pretend all objects are part of a (task) meta-object which will copy them one by one.

@fghanim @kiranchandramohan @kiranktp @AMDChirag @anchu-rajendran @SouraVX @Meinersbur, Anyone interested in taking this over? The required changes in addition to the diff have been discussed with @AndreyChurbanov in the comments above.

openmp/runtime/src/kmp_tasking.cpp
1764 ↗(On Diff #235555)

Did I get that right?

Yes, that was the idea.

(Apologies for the delayed response)

The required changes in addition to the diff have been discussed with @AndreyChurbanov in the comments above.

Actually the __kmpc_task function is going to be a bit more complicated in reality.

Because besides parameters for copy-construction + destruction of firstprivate objects, one will also need to pass to the __kmpc_task:

  • priority value if any;
  • affinity value(s) if any, probably single value is enough (this may be not supported yet, but apparently will be supported);
  • address of event handle to be returned for detachable task, offset of the event handle in shared_and_private_vars so that the library can put corresponding address there if it is used inside the task code, or some indicator that library should not put the address there if it is not used inside the task code (e.g. negative offset?);
  • offset of address of task structure in shared_and_private_vars for untied task, so that the compiler can generate __kmpc_omp_task calls to re-schedule partially executed task.

Also the __kmpc_task should better not call any __kmpc_* functions given recent activity of Joachim Protze (D92197) to eliminate all __kmpc_* calls from inside the library, and replace them with corresponding internal calls. Otherwise OMPT cannot reliably determine entry to / exit from the runtime library, IINM. But this might be a separate patch.

Please consider the special cases of task_if0, which preferably should use __kmpc_omp_task_with_deps:

  • #pragma omp task if(0) depend(out:A) detach(event) : the detached undeferred task needs the out dependency to release the dependency only when detach is fulfilled
  • #pragma omp task if(0) depend(inoutmutexset:A) : the undeferred task can execute mutually exclusive with any other inoutmutexset:A task.

I'm not sure that these are all the special cases.

See the bugs for full OpenMP code examples:

https://bugs.llvm.org/show_bug.cgi?id=46185
https://bugs.llvm.org/show_bug.cgi?id=46193

Hi @jdoerfert, we (BSC) may be able to work on this but we don't want to step on each one toes. Are there plans to push this forward (by you or someone else)?

How essential is changing the task interface in the context of your vision for the OpenMP-wise optimisations? Reusing the existing interface may not be ideal in that sense but could allow us to have a baseline already working for flang.

Thoughts? Thanks!

Hi @jdoerfert, we (BSC) may be able to work on this but we don't want to step on each one toes. Are there plans to push this forward (by you or someone else)?

How essential is changing the task interface in the context of your vision for the OpenMP-wise optimisations? Reusing the existing interface may not be ideal in that sense but could allow us to have a baseline already working for flang.

Thoughts? Thanks!

I am not actively working on this, @ggeorgakoudis was interested though. We can communicate via email what is a good way forward.
I am fine with keeping the original interface first and just moving the code, if you think that is easier.
(I was hoping that API simplifications will make the move easier but I might have been wrong.)

Hi all, is this patch being worked on? I wanted to use this for adding support for task construct in flang.

Herald added a project: Restricted Project. · View Herald TranscriptMar 28 2022, 10:19 PM

Hi all, is this patch being worked on? I wanted to use this for adding support for task construct in flang.

I don't think it is. Read the discussion to see why this got stalled. Best way forward is to take this but *not* do the API changes I included.

Hi all, is this patch being worked on? I wanted to use this for adding support for task construct in flang.

I don't think it is. Read the discussion to see why this got stalled. Best way forward is to take this but *not* do the API changes I included.

Okay, I am assuming a lot of the OpenMPIRBuilder.cpp code can be reused from here, but instead of creating the new proposed function, I should create the old functions with those arguments. I hope this is the correct approach, please correct me if I am wrong.

I wanted to know how to proceed about this? Do I borrow stuff from this patch and submit a new differential with you as a co-author? Do I push to this same differential? (Can we do that?) Please advise me on this. Thanks!

I wanted to know how to proceed about this? Do I borrow stuff from this patch and submit a new differential with you as a co-author? Do I push to this same differential? (Can we do that?) Please advise me on this. Thanks!

There is a "Commandeer Revision" option in the Add Action Menu. You can use that to take over this patch from @jdoerfert. Continuing here might have some advantages in the context, existing reviews and reviewers etc.

shraiysh commandeered this revision.Mar 30 2022, 10:55 AM
shraiysh added a reviewer: jdoerfert.

Commandeering this revision now.

shraiysh updated this revision to Diff 421309.Apr 7 2022, 12:13 PM

Update with basic support for createTask without the clauses.

shraiysh retitled this revision from [OpenMP][IRBuilder][WIP] Prototype `omp task` support to [OpenMP][IRBuilder] `omp task` support.Apr 7 2022, 12:16 PM
shraiysh edited the summary of this revision. (Show Details)
shraiysh updated this revision to Diff 421311.Apr 7 2022, 12:18 PM

Rerun builds without dependency

shraiysh updated this revision to Diff 422405.Apr 12 2022, 11:51 PM

Use correct method - getTypeStoreSize instead of getTypeSizeInBits.

Also, ping for review.

shraiysh updated this revision to Diff 423273.Apr 16 2022, 9:00 PM

Handle the case with no arguments to task function.

Herald added a project: Restricted Project. · View Herald Transcript
shraiysh updated this revision to Diff 423274.Apr 16 2022, 9:06 PM

Remove MLIR related code from this patch. (Added by mistake)

Thanks @shraiysh for taking up the work for task. This is the most important pending work for non-target OpenMP.

Could you expand the Summary a bit more with the approach taken and how it differs from current Clang task codegen?

I have added a few questions and comments.

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1290

Do you know whether attributes are added to the outlined/wrapper function? Is it added in the finalize function? CreateParallel seems to have in in the PostOutlineCB.

1302–1307

Is this a change from clang codegen? Why not runtime_call(..., outlined_fn, ...)? Can you say that explicitly in the summary that this is different from Clang if it was changed for a particular reason?

1335

Nit: corresponding

1365

Is the size of shared a TODO? And would further changes impact the task size as well?
It seems by default variables are passed as private copies, but if the enclosing region has a data-sharing of shared then that should be honoured.

1391

Nit: A debug dump of the IR here might be useful.

1393–1394

Nit: The size can be omitted.

1395

Is this step required? TaskRegionBlockSet and Blocks do not seem to be used here, also collectBlocks function seems to be called from finalize as well.

FYI, I am revising how outlining works, still waiting on D115216 getting reviewed.

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1313

Could you add an assert checking that there is just a single user, so we do not accidentally a wrong one.

1391

Please don't dump complete IR. Too much output makes -debug useless.

llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
1196–1198

[nit] only whitespace change

4452

[style] LLVM codsing style does not use Almost-Always-Auto.

4516–4521

Consider llvm::any_of

shraiysh edited the summary of this revision. (Show Details)Apr 26 2022, 4:37 AM
shraiysh updated this revision to Diff 425189.Apr 26 2022, 5:18 AM
shraiysh marked 11 inline comments as done.

Addressed comments

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1290

I have not added the attributes because I don't know about them and if they are required or not. I will check clang codegen once again for comments about any required attributes, but if you or anyone else knows what attributes are required and why, please let me know, I will add them.

1302–1307

The main reason for a difference is that Outlining generates a function call as shown in the input IR in the comment above. I could not find an easy way to change the arguments of a function after it has been constructed. Parallel does this by tweaking the CodeExtractor used inside outlining - and I did not want to touch outlining internally so that I can confidently rely on outlining to do its job.

A nice way to have similar (but not same) behavior as clang would be to add the inline attribute to the outlined function. After an Inline pass, the wrapper function will become the outlined function and it will be almost identical to clang's codegen. I will update the summary to highlight this difference.

1365

Yes, that is a todo, I have added it now. I think the task size will be reduced in that case. I have not looked into that yet and am doing private copies for everything at the moment.

1391

Please let me know if something less than IR should be dumped here.

1395

No it is not, I thought collectBlocks was required. I have removed it now.

llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
1196–1198

Removed. Thanks!

Meinersbur added inline comments.Apr 26 2022, 9:05 AM
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1265–1275

I don't understand what this means.

1276

[style] Please use LLVM's naming standard.

1288

BodyGenCB may have invalidated AllocaIP and cannot be used anymore.

1302–1307

Could you add the function signature differences between wrapper_fn and outlined_fn? I.e. what arguments does wrapper_fn throw away?

Btw, current Clang emits such a wrapper function with -g as well, so I don't see it as an issue. LLVM will always inline a private function with just a single call site, except in -O0 where always_inline would be needed.

1333

Please document what HasTaskData. TaskSize, NewTaskData, etc are.

1353

Use a llvm::Twine

1368

What is flag 1?

1387–1388
1399

Please avoid temporary instructions. Use splitBB instead.

shraiysh updated this revision to Diff 426227.Apr 30 2022, 6:28 AM
shraiysh marked 7 inline comments as done.

Addressed comments. Rebase with main.

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1265–1275

There are three basic block splits in the next few lines and this comment is meant to justify why three splits are needed - it basically tells which basic block will go where. Should I add something more to the comment?

1302–1307

The first argument is an i32 value but I do not know its usage. There is minimal documentation for task related stuff in OpenMP Runtime Library Reference in the OpenMP subproject. Here is the function signature for the wrapper function, and the first argument is discarded at the moment. If anyone has any reference to some documentation about this, or has any idea about what that argument represents, please let me know and I will try to handle it according to its purpose.

Btw, current Clang emits such a wrapper function with -g as well, so I don't see it as an issue. LLVM will always inline a private function with just a single call site, except in -O0 where always_inline would be needed.

Oh that is great then. Thank you!

1302–1307

Could you add the function signature differences between wrapper_fn and outlined_fn? I.e. what arguments does wrapper_fn throw away?

Btw, current Clang emits such a wrapper function with -g as well, so I don't see it as an issue. LLVM will always inline a private function with just a single call site, except in -O0 where always_inline would be needed.

1353

I have used it, but I am not sure if it is accurate usage. Please let me know if it seems inefficient.

1368

This is from flags in kmp.h. Value 1 means that it is tied (which is default unless untied clause is specified). Untied clause is not handled in this patch.

1399

AFAIK, splitBB requires an instruction pointer. I have updated this to erase the temporary instruction immediately after I have split the basic blocks, but I cannot figure out a way to completely eliminate temporary instructions. Is this okay?

Meinersbur added inline comments.May 2 2022, 4:51 PM
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1352

Don't store a Twine in a variable. See comment from Twine.h:

/// A Twine is not intended for use directly and should not be stored, its
/// implementation relies on the ability to store pointers to temporary stack
/// objects which may be deallocated at the end of a statement. Twines should
/// only be used accepted as const references in arguments, when an API wishes
/// to accept possibly-concatenated strings.
1355

No SmallString<128> needed. str() creates a std::string that implicitly converts to a llvm::StringRef that is valid until the end of the statement(";").

Compared to using std::string only, this saves the creation of one temporary std::sting (for OutlinedFn.getName()). Your version saves another one (if fewer than 128 chars), but it is also more complicated. Before StringRef was made more compatible to std::string_view, the .str() wasn't even need.

1399
BasicBlock *TaskExitBB = splitBB(Builder, "task.exit");
BasicBlock *TaskBodyBB = splitBB(Builder, "task.body");
BasicBlock *TaskAllocaBB = splitBB(Builder, "task.alloca");

Note that the reverse order. After this, Builder will insert instructions before the terminator of Currbb (where you probably don't want to insert instructions here).

llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
4501

If WrapperFunc is NULL, the next line will be a segfault. ASSERT_NE will stop execution if it fails.

4504

See above (and other occurances checking for nullptr)

4511–4514

Nice!

shraiysh updated this revision to Diff 426608.May 3 2022, 2:18 AM
shraiysh marked 10 inline comments as done.

Addressed comments

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1355

Alright, thanks for the explanation, I understand it better now.

Meinersbur added inline comments.May 3 2022, 8:31 PM
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1265–1275

Consider something that resembles LLVM-IR syntax. Eg.

// def outlined_fn() {
//   task.alloca:
//     br label %task.body
//
//   task.body:
//     ret void
// }

outlined_fn is a view after the finalize call, but here we are constructing the CFG before outlining, i.e. the description does not match.

1324–1326

[serious] SrcLocStrSize cannot be passed-by-value and passed-bt-referenced at the same statement (unsequenced).

1333

Still don't see a description of TaskSize.

For HasTaskData, could you explain why and how the function signature changes?

1351–1352

WrapperFuncName and WrapperFuncNameStorage are now dead. Could you remove them?

1368

Please document that.

llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
4538

[style]

4539

[style]

shraiysh updated this revision to Diff 428285.May 9 2022, 9:43 PM
shraiysh marked 5 inline comments as done.

Addressed comments.

shraiysh updated this revision to Diff 428286.May 9 2022, 9:45 PM

Added description for Tied Argument

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1265–1275

outlined_fn is a view after the finalize call, but here we are constructing the CFG before outlining, i.e. the description does not match.

Yes, that was supposed to justify why we need four splits here. Without this comment it wasn't really clear to me why three blocks won't be enough (currBB, task.body and task.exit). It also isn't directly intuitive that task.exit is not going to be a part of the outlined function. I have added that this is going to be the basic block mapping after outlining. If it seems unnecessary, please let me know and I will remove this comment.

1333

I have added that TaskSize refers to the argument sizeof_kmp_task_t in the runtime call. I did not want to add further information because the argument's exact meaning is not documented anywhere (the reference doesn't have it). My interpretation of the argument is the size of arguments in bytes, but that could be incorrect and I thought it was better to redirect anyone reading this/working on this to the argument of the runtime call instead of writing a possibly misleading description. Please let me know if it would be better to add the current interpretation of the argument.

For HasTaskData, could you explain why and how the function signature changes?

Documented this near the wrapper function. Please let me know if it requires some alteration.

Ping for review.

Meinersbur accepted this revision.May 19 2022, 8:21 AM

LGTM. Consider adding some description of TaskSize before committing.

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1333

https://github.com/llvm/llvm-project/blob/a4190037fac06c2b0cc71b8bb90de9e6b570ebb5/openmp/runtime/src/kmp_tasking.cpp#L1345

// sizeof_kmp_task_t:  Size in bytes of kmp_task_t data structure including
// private vars accessed in task.
This revision is now accepted and ready to land.May 19 2022, 8:21 AM
shraiysh updated this revision to Diff 431241.May 22 2022, 9:08 AM
shraiysh marked 3 inline comments as done.

Thank you @Meinersbur for pointing out the description of the argument, I had not checked the cpp file. I have added the description now.

Addressed comments. I will wait for two more days, and if there are no further comments on this patch, then I will land it.

Meinersbur accepted this revision.May 23 2022, 1:48 PM
This revision was landed with ongoing or failed builds.May 23 2022, 9:52 PM
This revision was automatically updated to reflect the committed changes.
mikerice added inline comments.
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1369

The return value of this dyn_cast is not checked. Should this use cast instead (to satisfy static verifiers)? Or do you expect a nullptr return here?

shraiysh added inline comments.Oct 13 2022, 11:29 AM
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
1369

I do not expect it to return a nullptr in most cases. There could be a few corner cases that I am not aware of where it might.

I think we can change it to cast instead of dyn_cast, or if that doesn't work, we should error out if we get a nullptr, until we find a valid testcase where nullptr is expected. I am busy for the next couple days so I will work on this after sometime, but meanwhile if this change turns any buildbots green and is required sooner, I will be able to review the change if required. Thank you for pointing this issue out.