.:: Bots United ::.

Cheeseh 05-01-2004 16:51

megaHAL enthusiasm
 
Hi Pierre. I'm hoping you know a lot about megaHAL.

I have been trying to get megaHAL to work with my bot, mostly from your source. I have got it working so far; I just have a few questions about memory issues with megaHAL and some bug-type things I've noticed with it.

I've had a look in the original megaHAL source/docs/Google about this but found nothing. Why does the Free/Empty dictionary function NOT free the actual words that were stored? This causes memory leaks unless they are freed elsewhere somehow, but I don't know where that could be (i.e. could they get copied into bot_model and stored until the end?)

It also depends on how you implement it: I'm using dynamically allocated strings most of the time, whereas I know yours uses the message stored in the bot, which is static.

The question is, is it safe to free all the strings in the Empty_Dictionary function? Or are they supposed to hang around in memory?! :D

Some bugs etc:

The input_words dictionary holds words that are from a message that the bot wants to learn from. The Make_Words function doesn't actually get a word, it gets a string from a point until the end of the whole message.
(this bit-->)
Code:

words->entry[words->size].word = input;
I have altered the function to put the proper word(s) into array positions. I will share it :)

Code:

void HAL_MakeWords (char *input, HAL_DICTIONARY *words)
{
  // this function breaks a string into an array of words
  int iLen;
  int iStart;
  int iEnd;

  char *szNewWord;
  int iNewWordLength;

  HAL_STRING *pNewWord;

  //int offset = 0;

  // clear the entries in the dictionary
  HAL_EmptyDictionary (words);

  if ( !input || !*input )
      return; // if void, return

  // re-written

  iLen = strlen(input);
  iStart = 0;
  iEnd = 0;

  while ( iStart < iLen )
  {
          if ( HAL_BoundaryExists(input,iEnd) || (iEnd == iLen) )
                  // If there is a new word to take in or at the end of the string
                  // then add the word to the dictionary
          {
                  // add the word to the dictionary
                  if (words->entry == NULL)
                          words->entry = (HAL_STRING *) malloc ((words->size + 1) * sizeof (HAL_STRING));
                  else
                          words->entry = (HAL_STRING *) realloc (words->entry,  (words->size + 1) * sizeof (HAL_STRING));

                  // get pointer to new word for quick access
                  pNewWord = &words->entry[words->size];
                  // work out new word length for new string
                  iNewWordLength = iEnd - iStart;

                  szNewWord = (char*)malloc(sizeof(char)*(iNewWordLength+1));

                  // copy word into string szNewWord
                  // word starts from position iStart until iEnd
                  // (goes on for iNewWordLength)
                  strncpy(szNewWord,&input[iStart],iNewWordLength);
                  szNewWord[iNewWordLength] = 0;

                  // update new word store
                  pNewWord->length = iNewWordLength;
                  pNewWord->word = szNewWord;

                  // increment number of words stored
                  words->size++;

                  iStart = iEnd;
          }

          iEnd++;
  }

// rest of function below except the add word stuff
...
...

Important: this change also adds spaces as words. Is this correct? I thought the boundary function would handle that?

Another small problem: when you free input_words at the end, when the server deactivates, you sometimes encounter the word "." that was added in the MakeWords function. You can't free it because it wasn't malloc'd in MakeWords; trying to free it triggers an assertion failure instead.

Anyway, it's looking good. Your code is all credited, and I'll also need to release the source of my bot soon =)

Pierre-Marie Baty 05-01-2004 22:09

Re: megaHAL enthusiasm
 
First off, you won't "need" to release the source of your bot, since I'm using the BSD license, which is a non-contaminating license (i.e. you can do anything with the source code, including commercial or closed-source stuff).

HAL_EmptyDictionary() does not free() the memory taken up by each of the dictionary's words; that's just a speed hack. Instead of freeing everything and slapping malloc() calls everywhere again later, we use the handy realloc() function to reallocate each word. Since dictionaries are only altered to grow (not to shrink), no memory is leaked. Here is an example of how a dictionary grows:

"hello stupid bot"
(1)HELLO (2)STUPID (3)BOT
"you are a bot"
(1)HELLO (2)STUPID (3)BOT (4)YOU (5)ARE (6)A
"a bot is a stupid program"
(1)HELLO (2)STUPID (3)BOT (4)YOU (5)ARE (6)A (7)IS (8)PROGRAM

If you were to free each word and allocate it again afterwards, you would end up manipulating memory twice as much.

Also, adding a trailing dot to the generated sentences was a bad idea: I removed it (now the sentences are output as they come). Yes, you can't free() that one. It was one of the first hacks I put into the MegaHAL code, but I didn't really know what I was doing then :D

Finally, I don't see why you add spaces as words: dictionaries are not supposed to contain spaces. And I don't quite understand the reason for your modification to HAL_MakeWords(). Do you really get something working with this ???:(

Here's my current HAL_MakeWords() function:
Code:

void HAL_MakeWords (char *input, HAL_DICTIONARY *words)
{
  // this function breaks a string into an array of words
  int offset = 0;
  // clear the entries in the dictionary
  HAL_EmptyDictionary (words);
  if (strlen (input) == 0)
          return; // if void, return
  // loop forever
  while (TRUE)
  {
          // if the current character is of the same type as the previous character, then include
          // it in the word. Otherwise, terminate the current word.
          if (HAL_BoundaryExists (input, offset))
          {
                // add the word to the dictionary
                if (words->entry == NULL)
                        words->entry = (HAL_STRING *) malloc ((words->size + 1) * sizeof (HAL_STRING));
                else
                        words->entry = (HAL_STRING *) realloc (words->entry,  (words->size + 1) * sizeof (HAL_STRING));
                if (words->entry == NULL)
                        TerminateOnError ("RACC: HAL_MakeWords() unable to reallocate dictionary\n");
                words->entry[words->size].length = (unsigned char) offset;
                words->entry[words->size].word = input;
                words->size += 1;
                if (offset == (int) strlen (input))
                        break;
                input += offset;
                offset = 0;
          }
          else
                offset++;
  }
  return; // finished, no need to add punctuation (it's an ACTION game, woohoo!)
}
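
To make the question about spaces concrete, here is a minimal sketch of the same slicing loop with a stand-in boundary test. The real HAL_BoundaryExists() is not shown in this thread, so boundary_sketch() is purely an assumption: it reports a boundary at the end of the string and wherever the text switches between alphabetic and non-alphabetic characters. count_slices() is also a hypothetical helper that only counts the entries the loop would add.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical boundary test, NOT the real HAL_BoundaryExists():
   true at the end of the string and wherever the character class
   (alphabetic / non-alphabetic) changes. Never true at position 0. */
static int boundary_sketch (const char *s, int pos)
{
   if (pos == 0)
      return 0;
   if (pos == (int) strlen (s))
      return 1;
   return (isalpha ((unsigned char) s[pos]) != 0)
       != (isalpha ((unsigned char) s[pos - 1]) != 0);
}

/* Same control flow as the loop above, but it only counts the
   entries that would be added instead of storing them. */
static int count_slices (const char *input)
{
   int offset = 0;
   int count = 0;

   if (*input == '\0')
      return 0; // if void, return

   for (;;)
   {
      if (boundary_sketch (input, offset))
      {
         count++; // one entry: (input, offset)
         if (offset == (int) strlen (input))
            break;
         input += offset;
         offset = 0;
      }
      else
         offset++;
   }
   return count;
}
```

With this assumed boundary test, count_slices ("hello bot") returns 3, because the single space between the two words becomes an entry of its own. Whether separators end up in the list therefore depends on the boundary function, not on the loop.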


Cheeseh 05-01-2004 22:20

Re: megaHAL enthusiasm
 
Alright. Thanks for the info.

The thing I don't get with the MakeWords function above is that you have an input string called "input" and you find words from it.

If you are trying to find a word, I understand the offset thing where it is incrementing the point from the start of the word you want, yeah? So you make a word equal a point in the input string, but what you actually stored is the whole string from a certain point, so it's not actually a word?

It's difficult to explain.

I basically needed to change it because of the memory: I don't know if the words need to be stored for later use, so I create a new word and keep it there.

So, do the words in the input_words array need to be given memory for use later in the program? Or can they be freed after using them in the HAL_Learn function, for example?

Pierre-Marie Baty 05-01-2004 22:48

Re: megaHAL enthusiasm
 
"If you are trying to find a word, I understand the offset thing where it is incrementing the point from the start of the word you want, yeah? So you make a word equal a point in the input string, but what you actually stored is the whole string from a certain point, so it's not actually a word?"
Yes, that's exactly it. It's difficult to explain, and sometimes I'm not sure I understand it completely myself :)

There is no real memory leak, since it's always the same dictionary which is used again and again: it's the bot's "input_words" dictionary. When the first replies come, the dictionary is empty, so the words in it get malloc()'d. THEN, as other replies come to the chat window, the words that were already malloc()'d get realloc()'d and the other words that were NOT malloc()'d yet are allocated. All this stuff is freed when the bot disconnects, and only then (at least that's how it is in the RACC template #2).

There may be memory leaks elsewhere, but this is not one of them. How did you notice that memory was leaking?
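
A rough sketch of what that means in practice: each entry is a (pointer, length) pair into the original message, so the pointer alone still "sees" the rest of the message and only the length marks where the word stops. SLICE is a hypothetical stand-in for HAL_STRING that uses the same two fields as the code above; slice_to_cstring() and slice_equals() are illustrative helpers, not functions from the MegaHAL or RACC source.

```c
#include <string.h>

/* Hypothetical stand-in for HAL_STRING: the pointer is NOT
   NUL-terminated at the end of the word, only `length` says
   how many characters belong to it. */
typedef struct
{
   unsigned char length;
   const char *word;
} SLICE;

/* Copy the slice into a caller-supplied buffer as a real C string,
   truncating if the buffer is too small. */
static void slice_to_cstring (const SLICE *s, char *out, size_t out_size)
{
   size_t n = s->length;

   if (out_size == 0)
      return;
   if (n >= out_size)
      n = out_size - 1;
   memcpy (out, s->word, n);
   out[n] = '\0';
}

/* Compare a slice with a C string, looking at `length` characters only. */
static int slice_equals (const SLICE *s, const char *text)
{
   return strlen (text) == s->length
       && memcmp (s->word, text, s->length) == 0;
}
```

For a slice of length 5 pointing at the start of "HELLO BOT", reading the pointer as a plain C string gives the whole message, while slice_to_cstring() gives just "HELLO". Any code that handles these entries has to go through the length, never through strlen() or "%s" on the pointer.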

Cheeseh 05-01-2004 23:03

Re: megaHAL enthusiasm
 
"How did you notice that memory was leaking?"

I had problems when freeing the input_words dictionary after the bot disconnects. I got some assertion failures when trying to free the strings other than the "." words. The strings looked as though they had already been freed somehow. But that goes back to the problem of using the original input string for choosing the words, because the string wouldn't exist anymore.

I'm going to change it back to the way it was though and put the MakeWords function back. I just had a look at the code a bit more, and noticed that the AddWords function makes its own copy of a string, so you don't need to keep a string stored. But input_words doesn't do this <edit> because it's just pointing at positions in a string in another part of memory in my code, so basically I don't have to free the strings in input_words after disconnecting, just the strings in the bot_model dictionary </edit>

Anyway, I'll have a go and ask about anything else when I need to, hehe.
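
A minimal sketch of that ownership rule, with hypothetical names throughout (WORD_ENTRY, WORD_LIST, list_append() and list_free() are illustrative and not taken from the MegaHAL or RACC source): a list that only borrows pointers into the message frees its entry array and nothing else, while a list that copied its words frees each copy first.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical types mirroring the length/word and size/entry layout. */
typedef struct { unsigned char length; char *word; } WORD_ENTRY;
typedef struct { unsigned long size; WORD_ENTRY *entry; } WORD_LIST;

/* Append one word. With copy != 0 the characters are duplicated, so the
   list owns them; with copy == 0 the list only borrows the pointer.
   Returns 1 on success, 0 if an allocation failed. */
static int list_append (WORD_LIST *list, const char *text,
                        unsigned char length, int copy)
{
   WORD_ENTRY *grown;
   char *stored = (char *) text; // borrowed by default

   if (copy)
   {
      stored = (char *) malloc ((size_t) length + 1);
      if (stored == NULL)
         return 0;
      memcpy (stored, text, length);
      stored[length] = '\0';
   }

   // realloc (NULL, n) behaves like malloc (n), so one call covers both cases
   grown = (WORD_ENTRY *) realloc (list->entry,
                                   (list->size + 1) * sizeof (WORD_ENTRY));
   if (grown == NULL)
   {
      if (copy)
         free (stored);
      return 0;
   }

   list->entry = grown;
   list->entry[list->size].length = length;
   list->entry[list->size].word = stored;
   list->size++;
   return 1;
}

/* Release a list. Only free the words if this list made the copies. */
static void list_free (WORD_LIST *list, int owns_words)
{
   unsigned long i;

   if (owns_words)
      for (i = 0; i < list->size; i++)
         free (list->entry[i].word);

   free (list->entry);
   list->entry = NULL;
   list->size = 0;
}
```

In these terms, input_words would be released with list_free (&list, 0) and a dictionary that copied its strings with list_free (&list, 1). Calling free() on a borrowed pointer, or on a string literal such as ".", is undefined behaviour, which would fit the assertion failures described above.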

