Commit Graph

81 Commits

Author SHA1 Message Date
Matt Arsenault
9ec3cf2c8a Move ExtractVectorElements to SelectionDAG.
This seems generally useful, and makes sense to
go along with SplitVector.

llvm-svn: 206041
2014-04-11 17:47:30 +00:00
Tom Stellard
50122a5890 R600: Match 24-bit arithmetic patterns in a Target DAGCombine
Moving these patterns from TableGen files to PerformDAGCombine()
should allow us to generate better code by eliminating unnecessary
shifts and extensions earlier.

This also fixes a bug where the MAD pattern was calling
SimplifyDemandedBits with a 24-bit mask on the first operand
even when the full pattern wasn't being matched.  This occasionally
resulted in some instructions being incorrectly deleted from the
program.

v2:
  - Fix bug with 64-bit mul

llvm-svn: 205731
2014-04-07 19:45:41 +00:00
Matt Arsenault
7939acd7fa Use .data() instead of &x[0]
llvm-svn: 205722
2014-04-07 16:44:24 +00:00
Matt Arsenault
378bf9c68b R600: Compute masked bits for min and max
llvm-svn: 205242
2014-03-31 19:35:33 +00:00
Matt Arsenault
4c53717787 R600: Add BFE, BFI, and BFM intrinsics to help with writing tests.
llvm-svn: 205236
2014-03-31 18:21:18 +00:00
Matt Arsenault
b34583661b R600: Add target nodes for BFM and BFI
llvm-svn: 205235
2014-03-31 18:21:13 +00:00
Matt Arsenault
b517c8128e R600: Implement isZExtFree.
This allows 64-bit operations that are truncated to be reduced
to 32-bit ones.

llvm-svn: 204946
2014-03-27 17:23:31 +00:00
Matt Arsenault
d125d74a73 R600/SI: Fix unreachable with a sext_in_reg to an illegal type.
llvm-svn: 204945
2014-03-27 17:23:24 +00:00
Matt Arsenault
90b733a3cf R600: Add a testcase for sext_in_reg I missed.
This sext_inreg i32 in i64 case was already handled, but not enabled.

llvm-svn: 204840
2014-03-26 18:31:06 +00:00
Matt Arsenault
0c274feedf R600: Move computeMaskedBitsForTargetNode out of AMDILISelLowering.cpp
Remove handling of select_cc, since it makes no sense to be there. This
now does nothing, but I'll be adding some handling of other target nodes
soon.

llvm-svn: 204743
2014-03-25 18:18:27 +00:00
Matt Arsenault
a7f1e0c44f R600: Implement isNarrowingProfitable.
llvm-svn: 204658
2014-03-24 19:43:31 +00:00
Matt Arsenault
fae02989b7 R600: Match sign_extend_inreg to BFE instructions
llvm-svn: 204072
2014-03-17 18:58:11 +00:00
Matt Arsenault
ea330fbe49 R600: Remove unnecessary attempt to zext a pointer.
Private pointers are now always 32-bits.

llvm-svn: 203989
2014-03-15 00:08:26 +00:00
Matt Arsenault
74891cdefe R600: Code cleanup.
Use sign_extend_inreg and getZeroExtendInReg instead of
using the bit operations they expand into.

llvm-svn: 203988
2014-03-15 00:08:22 +00:00
Matt Arsenault
e389dd5d68 R600: Fix trunc store from i64 to i1
llvm-svn: 203695
2014-03-12 18:45:52 +00:00
Matt Arsenault
0211714ecb R600: Calculate store mask instead of using switch.
llvm-svn: 203527
2014-03-11 01:38:53 +00:00
Matt Arsenault
9504d2f269 Use .data() instead of &x[0]
llvm-svn: 203516
2014-03-11 00:01:31 +00:00
Matt Arsenault
f9a995d68c R600: Fix extloads from i8 / i16 to i64.
This appears to only be working for global loads. Private
and local break for other reasons.

llvm-svn: 203135
2014-03-06 17:34:12 +00:00
Matt Arsenault
9fe669c522 R600/SI: Expand selects on vectors.
llvm-svn: 203134
2014-03-06 17:34:03 +00:00
Matt Arsenault
ca6dcfcf59 Fix typo
llvm-svn: 203013
2014-03-05 21:47:22 +00:00
Matt Arsenault
41e2f2bacd R600/SI - Add new CI arithmetic instructions.
Does not yet include larger part required
to match v_mad_i64_i32 / v_mad_u64_u32.

llvm-svn: 202077
2014-02-24 21:01:28 +00:00
Matt Arsenault
21a3faaf25 Fix DOT4 missing from getTargetOpcodeName
llvm-svn: 202075
2014-02-24 21:01:21 +00:00
Tom Stellard
967bf5813f R600/SI: Expand all v8[if]32 operations
llvm-svn: 201371
2014-02-13 23:34:15 +00:00
Benjamin Kramer
53f9df4c93 R600: Always implement both versions of isTruncateFree and add a sanity check.
llvm-svn: 201222
2014-02-12 10:17:54 +00:00
Matt Arsenault
0cdcd961bf R600: Implement isTruncateFree
Truncation is just accessing a subregister for any multiple of
the register size, so it's free.

llvm-svn: 201107
2014-02-10 19:57:42 +00:00
Tom Stellard
aeb456438c R600/SI: Expand i1 BR_CC
This fixes a crashes in the OpenCV test suite and also the scrypt
kernel in bfgminer.

I was unable to come up with a reduced test case for this.

https://bugs.freedesktop.org/show_bug.cgi?id=72785

llvm-svn: 200776
2014-02-04 17:18:43 +00:00
Tom Stellard
bfebd1fc7e R600: Enable vector fpow.
The OpenCL specs say: "The vector versions of the math functions operate
component-wise. The description is per-component."

Patch by: Jan Vesely

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 200773
2014-02-04 17:18:37 +00:00
Tom Stellard
04c0e9851b R600: Add support for global addresses with constant initializers
llvm-svn: 199825
2014-01-22 19:24:21 +00:00
Tom Stellard
e93736057f R600/SI: Add support for i8 and i16 private loads/stores
llvm-svn: 199823
2014-01-22 19:24:14 +00:00
Tom Stellard
eddfa69465 R600: Allow ftrunc
v2: Add ftrunc->TRUNC pattern instead of replacing int_AMDGPU_trunc
v3: move ftrunc pattern next to TRUNC definition, it's available since R600

Patch By: Jan Vesely

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 197783
2013-12-20 05:11:55 +00:00
Matt Arsenault
52226f9a8e Don't manually calculate size in bytes
llvm-svn: 197327
2013-12-14 18:21:59 +00:00
Matt Arsenault
eaa3a7efab Use llvm_unreachable instead of assert(0)
llvm-svn: 196971
2013-12-10 21:37:42 +00:00
Tom Stellard
175e7a8c97 R600: Expand vector FABS
NOTE: This is a candidate for the 3.4 branch.
llvm-svn: 195881
2013-11-27 21:23:39 +00:00
Tom Stellard
4d566b2edf R600: Add support for ISD::FROUND
NOTE: This is a candidate for the 3.4 branch.
llvm-svn: 195878
2013-11-27 21:23:20 +00:00
Matt Arsenault
c5559bb14b Add target hook to prevent folding some bitcasted loads.
This is to avoid this transformation in some cases:
fold (conv (load x)) -> (load (conv*)x)

On architectures that don't natively support some vector
loads efficiently casting the load to a smaller vector of
larger types and loading is more efficient.

Patch by Micah Villmow.

llvm-svn: 194783
2013-11-15 04:42:23 +00:00
Tom Stellard
81d871dee3 R600/SI: Add support for private address space load/store
Private address space is emulated using the register file with
MOVRELS and MOVRELD instructions.

llvm-svn: 194626
2013-11-13 23:36:50 +00:00
Vincent Lejeune
4f3751f2af R600: Fix LowerUDIVREM
llvm-svn: 194153
2013-11-06 17:36:04 +00:00
Tom Stellard
c947d8ca64 R600: Custom lower f32 = uint_to_fp i64
llvm-svn: 193701
2013-10-30 17:22:05 +00:00
Tom Stellard
e118b8becd R600: Expand vector FSQRT ops
llvm-svn: 193620
2013-10-29 16:37:20 +00:00
Tom Stellard
af77543244 R600: Fix handling of vector kernel arguments
The SelectionDAGBuilder was promoting vector kernel arguments to legal
types, but this won't work for R600 and SI since kernel arguments are
stored in memory and can't be promoted.  In order to handle vector
arguments correctly we need to look at the original types from the LLVM IR
function.

llvm-svn: 193215
2013-10-23 00:44:32 +00:00
Tom Stellard
afcf12f33a R600/SI: expose TBUFFER_STORE_FORMAT_* for OpenGL transform feedback
For _XYZ, the type of VDATA is v4i32, because v3i32 doesn't exist.

The ADDR64 bit is not exposed. A simpler intrinsic that doesn't take
a resource descriptor might be nicer.

The maximum number of input SGPRs is bumped to 17.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 190575
2013-09-12 02:55:14 +00:00
Tom Stellard
de60e25278 R600: Fix incorrect LDS size calculation
GlobalAdderss nodes that appeared in more than one basic block were
being counted twice.

llvm-svn: 190078
2013-09-05 18:37:57 +00:00
Tom Stellard
35bb18c2a7 R600: Add support for vector local memory loads
llvm-svn: 189226
2013-08-26 15:06:04 +00:00
Tom Stellard
f3d166aa1e R600: Add support for i8 and i16 local memory stores
llvm-svn: 189223
2013-08-26 15:05:49 +00:00
Tom Stellard
2ffc330673 R600: Add support for v4i32 and v2i32 local stores
llvm-svn: 189222
2013-08-26 15:05:44 +00:00
Tom Stellard
fd155828ed SelectionDAG: Use correct pointer size when lowering function arguments v2
This adds minimal support to the SelectionDAG for handling address spaces
with different pointer sizes.  The SelectionDAG should now correctly
lower pointer function arguments to the correct size as well as generate
the correct code when lowering getelementptr.

This patch also updates the R600 DataLayout to use 32-bit pointers for
the local address space.

v2:
  - Add more helper functions to TargetLoweringBase
  - Use CHECK-LABEL for tests

llvm-svn: 189221
2013-08-26 15:05:36 +00:00
Tom Stellard
f6d8023ca4 R600: Remove unnecessary casts
Spotted by Bill Wendling.

llvm-svn: 188942
2013-08-21 22:14:17 +00:00
Tom Stellard
b249b75726 R600: Expand vector FRINT ops
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 188598
2013-08-16 23:51:33 +00:00
Tom Stellard
ad3aff246c R600: Expand vector FFLOOR ops
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 188597
2013-08-16 23:51:29 +00:00
Tom Stellard
a92ff87929 R600: Expand vector float operations for both SI and R600
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 188596
2013-08-16 23:51:24 +00:00