.:: Bots United ::.  

c++ sse intrinsics
  (#1)
mirv
Super Moderator
 
Posts: 152
Join Date: Apr 2004
c++ sse intrinsics - 29-06-2007

Has anybody had much experience with SSE intrinsics, specifically using Visual Studio? I've been poking at a (3D) vector class (more for entertainment than anything else), and it compiles and runs just fine under Linux, but I can't seem to get it to work under Windows: it just crashes. I'm fairly certain I've set byte alignment correctly, and my CPU does support the instruction set.
So yeah, just curious if anyone's had something similar. It's kind of got me stumped.

-- Problem not solved, but I think it's related to heap allocations not landing on proper 16-byte boundaries.
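A minimal sketch of the alignment hypothesis above, assuming the crash really does come from heap storage that is not on a 16-byte boundary (the thread does not confirm this). `is_aligned16`, `alloc_floats16` and `add4` are hypothetical helper names made up for illustration; `_mm_malloc` and `_mm_free` are the aligned allocation routines that come with the SSE headers.

```cpp
#include <xmmintrin.h>  // __m128, _mm_load_ps, _mm_store_ps, _mm_malloc, _mm_free
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true when p lies on a 16-byte boundary.
inline bool is_aligned16(const void *p) {
  return (reinterpret_cast<std::uintptr_t>(p) & 15u) == 0;
}

// Hypothetical helper: allocate `count` floats on a 16-byte boundary.
// The result must be released with _mm_free, not free or delete.
inline float *alloc_floats16(std::size_t count) {
  return static_cast<float *>(_mm_malloc(count * sizeof(float), 16));
}

// Adds two blocks of four floats. _mm_load_ps and _mm_store_ps require
// 16-byte aligned pointers, so the assert catches bad addresses early.
inline void add4(const float *a, const float *b, float *out) {
  assert(is_aligned16(a) && is_aligned16(b) && is_aligned16(out));
  _mm_store_ps(out, _mm_add_ps(_mm_load_ps(a), _mm_load_ps(b)));
}
```

Calling `is_aligned16(this)` from a debug assert inside the vector class would be one way to check whether a plain `new` or `malloc` handed back a misaligned object before the first SSE instruction touches it.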


mirv the Silly Fish.

Last edited by mirv; 29-06-2007 at 17:40.
  
  (#2)
The Storm
Council Member / E[POD]bot developer
 
Posts: 1,620
Join Date: Jul 2004
Location: Bulgaria
Re: c++ sse intrinsics - 30-06-2007

Post the code here and I will try to debug it.
  
  (#3)
mirv
Super Moderator
 
Posts: 152
Join Date: Apr 2004
Re: c++ sse intrinsics - 30-06-2007

Code below - just don't expect it to be too neat, and beware the evil lack of commenting.
Kind of my first time mucking about with intrinsics as well, so I may actually be using them entirely incorrectly.

Code:
#include <xmmintrin.h> // SSE intrinsics: __m128, _mm_add_ps, ...
#include <math.h>      // sqrt, acos

struct vector3D
{
  union {
    __m128 vector;
    struct { float x,y,z,w; };
  };
  vector3D() : x(0.0f), y(0.0f), z(0.0f), w(1.0f) {}
  vector3D(float inx, float iny, float inz) : x(inx), y(iny), z(inz), w(1.0f) {}
  vector3D(const float *inarray) : x(inarray[0]), y(inarray[1]), z(inarray[2]), w(1.0f) {}
  vector3D(__m128 val) : vector(val) {}

  inline const vector3D &operator=(const vector3D &src) { x = src.x; y = src.y; z = src.z; return *this; }
  inline const vector3D &operator=(const float *src) { x = src[0]; y = src[1]; z = src[2]; return *this; }
  inline const vector3D &operator+=(const vector3D &src) { vector = _mm_add_ps(vector, src.vector); return *this; }
  inline const vector3D &operator-=(const vector3D &src) { vector = _mm_sub_ps(vector, src.vector); return *this; }
  inline const vector3D &operator*=(float value) { x *= value; y *= value; z *= value; return *this;}
  inline const vector3D &operator/=(float value) { float inverse = 1.0f / value; x *= inverse; y *= inverse; z *= inverse; return *this;}

  inline float dot(const vector3D &invec) const { vector3D result(_mm_mul_ps(vector, invec.vector)); return (result.x + result.y + result.z); }
  inline void normalise() { vector3D result(_mm_mul_ps(vector, vector)); vector = _mm_div_ps(vector, _mm_set1_ps(sqrt(result.x + result.y + result.z))); }
  inline float length() const { vector3D result(_mm_mul_ps(vector, vector)); return sqrt(result.x + result.y + result.z); }
  inline float length_squared() const { vector3D result(_mm_mul_ps(vector, vector)); return (result.x + result.y + result.z); }
  inline float angle(const vector3D &in_vec) const { return acos(dot(in_vec) / sqrt(length_squared() * in_vec.length_squared())); } // cos(angle) = a.b / (|a| * |b|)
};

inline const vector3D operator+(const vector3D &a, const vector3D &b) { vector3D rvec(_mm_add_ps(a.vector, b.vector)); return rvec;}
inline const vector3D operator-(const vector3D &a, const vector3D &b) { vector3D rvec(_mm_sub_ps(a.vector, b.vector)); return rvec; }
inline const vector3D operator*(const vector3D &a, const vector3D &b) { vector3D rvec(a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x); return rvec; }
inline const vector3D operator*(float value, const vector3D &a) { vector3D rvec(_mm_mul_ps(_mm_set1_ps(value), a.vector)); return rvec; }
inline const vector3D operator*(const vector3D &a, float value) { vector3D rvec(_mm_mul_ps(_mm_set1_ps(value), a.vector)); return rvec; }
inline const vector3D operator/(const vector3D &a, float value) { vector3D rvec(_mm_div_ps(a.vector, _mm_set1_ps(value))); return rvec; }
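One possible way to keep heap instances of a struct like this on a 16-byte boundary is a class-level `operator new` / `operator delete` pair built on `_mm_malloc` and `_mm_free`. This is only a sketch under the assumption that misaligned heap storage is the problem; `aligned_vec4` is a hypothetical, cut-down stand-in for `vector3D`, not the class from this post.

```cpp
#include <xmmintrin.h>  // __m128, _mm_set_ps, _mm_malloc, _mm_free
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>          // std::bad_alloc

// Hypothetical stand-in for vector3D: one __m128 holding x, y, z, w.
struct aligned_vec4 {
  __m128 vector;

  // _mm_set_ps takes its arguments highest lane first, so this is
  // x = 0, y = 0, z = 0, w = 1.
  aligned_vec4() : vector(_mm_set_ps(1.0f, 0.0f, 0.0f, 0.0f)) {}

  // Route `new aligned_vec4` through a 16-byte aligned allocation.
  static void *operator new(std::size_t size) {
    void *p = _mm_malloc(size, 16);
    if (!p) throw std::bad_alloc();
    return p;
  }

  // Memory from _mm_malloc must go back through _mm_free.
  static void operator delete(void *p) {
    if (p) _mm_free(p);
  }
};
```

Arrays (`new aligned_vec4[n]`) and objects embedded in other heap-allocated structs are not covered by this pair and would need their own handling.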


mirv the Silly Fish.
  
  (#4)
@$3.1415rin
Council Member, Author of JoeBOT
 
Posts: 1,381
Join Date: Nov 2003
Location: Germany
Re: c++ sse intrinsics - 08-07-2007

I did a bit with it once, but the code is pretty crappy: http://johannes.lampel.net/projects/nvec/

The speedup can be seen, for example, at http://johannes.lampel.net/projects/bifurk/


  