FILE README.TXT



               Realtime Pitch-shifting on the GUS and GUSMAX


                                    by

                          H.R. Jiang & Y.C. Yang

                    Department of Electronics Engineering
                National Chiao Tung University, Taiwan, R.O.C




                            PSR  v1.5 (GUS)
                            PS16 v2.0 (GUSMAX)
                            PSG  v2.0 (GUSMAX with Graphics)




1. INTRODUCTION:

        Pitch shifting is an interesting research field in the DSP area.
    It uses very little traditional DSP technique, yet produces quite
    amazing effects.  Many karaoke machines can pitch-shift the music
    they play.

        We wrote this software because one of the authors, H.R. Jiang,
    took a 3-semester course in the NCTU EE department, Taiwan.  The
    advisor of this course is Dr. S.G Chen.  H.R. Jiang was assigned this
    problem.

        Originally we wished to program on the TMS320C30 EVM, but the
    algorithm was too large to fit in its 16k-word environment.
    Therefore we searched for another programming environment that
    met our needs.  And then, suddenly, I thought of the GUSes in our PCs.
    So I sat down and began to ftp the SDK for the GUS.  In less than 4
    hours, a prototype of the realtime GUS driver was ready.

        Please feel free to tell us what you think of the program by
    sending email to:

                u8211512@cc.nctu.edu.tw

    or by sending a postcard to:

        H.R. Jiang
        Room 313, Dorm Bamboo
        National Chiao Tung University.
        HsinChu, Taiwan.

    or

        Y.C. Yang
        Room 3115, Dorm 11
        National Chiao Tung University
        HsinChu, Taiwan.

        Remember, programmers need support, either by your mail or
    by your money. :)


2. ALGORITHM:

        The algorithm is based on the thesis of G.J Lin.  We may include
    the whole algorithm in the next release.


3. HOW TO RUN:

  a. Program Requirements

        + 386 or above.  486 or above is recommended.  FPU is optional.
        + GUS or GUSMAX with 256k (or more) of memory.
        + An input device (tape deck, CD player or mic) connected to
          the LINEIN, and an output device connected to the output of
          the soundcard.
        + (If you use psg.exe) a mouse.


  b. Hardware Aspect

        First run the executable without any options.  It will show you
    how to run the program.  There is also some hardware configuration
    you must do before you actually run the program.

        1. Run the program only on a 386 or above.  Systems without an
           FPU may have problems when you press '+' or '-' to change
           the pitch during runtime.

        (For GUS users only)

        2. Make sure that the right channel of the LINEIN is silent
           and that the speaker or headphone is connected only to the
           right channel of the LINEOUT or Amp-OUT.  If you need help
           with this hardware configuration, please refer to the file
           HARDWARE.TXT.  It shows you how to build the component
           you might need.


  c. Software Aspect

        1. Make sure that you have initialized the GUS correctly.  Due
           to the real-time requirement, please use two separate DMA
           channels for playback and record.  To set up your GUS (or
           GUSMAX) with two separate DMA channels, modify the DOS
           environment variable "ULTRASND" and then run ultrinit.exe.
           You can set both channels to 16-bit DMA channels to speed
           up the program.

        2. Please make sure that no other GUS-controlling software
           (such as MEGAEM, SBOS, ULTRAMID...) is running.


  d. Options

        -f: Use the FULL SEARCH method.  This takes 3-4 times the
            computation time, but the result is better than with the
            default search method.  On my 486DX2-66 I can't use full
            search at a 44.1kHz sample rate.

        --:
            Specify that the rest of the command line consists only
            of parameters.  Use this option to specify a negative
            pitch.

        -c#:
            Specify the cost function you want.  Currently # ranges
            from 0 to 5.

                #=0 : MAE (default)
                        MAE (Mean Absolute Error) is a typical cost
                        function for measuring the distance between
                        two sequences.
                #=1 : unrolled MAE
                        Loop-unrolled version of #=0, much faster.
                #=2 : subsampled MAE
                        Subsampled version of #=0, the fastest of the
                        6 choices, but it might degrade the output.
                #=3 : MSE
                        MSE (Mean Square Error) is another type of
                        cost function.
                #=4 : unrolled MSE
                        Loop-unrolled version of #=3.
                #=5 : subsampled MSE
                        Subsampled version of #=3, might degrade
                        the result.

        -b$$$$$:
            Specify your own buffer size.  A larger buffer causes a
            longer delay, but a smaller buffer has more overhead and
            can cause buffer overruns.  A buffer that is too small
            (less than 4096) will cause an error.

            A smaller buffer size isn't always better.  The algorithm
            needs about 8kByte of pre-recorded buffer (depending on the
            sample rate), so smaller buffers mean more precached
            buffers.  The only way to reduce the delay is to reduce the
            sample rate.

        -p$$$$$:
            Specify the number of precached buffers (NOT their size!).
            It should be at least 0 when the buffer size is larger
            than 8k.  The default is derived from the sample rate and
            the buffer size.

        -l: Use a larger search range.  The result is unpredictable:
            sometimes better, sometimes worse.  Experiment with this
            option on your own.

        -m: Turn on MICIN.  (Not available in the GUSMAX version.)
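
        The cost functions selected by -c can be illustrated with a
    minimal C sketch.  This is not the code used in psr.exe; the
    function names, the 8-bit unsigned sample format and the 4-way
    unrolling factor are assumptions made for the example only.

```c
#include <assert.h>
#include <stddef.h>   /* size_t, NULL */
#include <stdlib.h>   /* abs */

/* Sum of absolute differences between two blocks of 8-bit samples
 * (sample format is an assumption).  Dividing by n would give the MAE
 * proper; when every candidate has the same length n, the plain sum
 * ranks them identically. */
unsigned long cost_mae(const unsigned char *a, const unsigned char *b,
                       size_t n)
{
    unsigned long sum = 0;
    size_t i;

    assert(a != NULL && b != NULL);
    for (i = 0; i < n; i++)
        sum += (unsigned long)abs((int)a[i] - (int)b[i]);
    return sum;
}

/* Same result as cost_mae, with the loop body unrolled 4 times so the
 * loop overhead is paid once per 4 samples. */
unsigned long cost_mae_unrolled(const unsigned char *a,
                                const unsigned char *b, size_t n)
{
    unsigned long sum = 0;
    size_t i = 0;

    assert(a != NULL && b != NULL);
    for (; i + 4 <= n; i += 4) {
        sum += (unsigned long)abs((int)a[i]     - (int)b[i]);
        sum += (unsigned long)abs((int)a[i + 1] - (int)b[i + 1]);
        sum += (unsigned long)abs((int)a[i + 2] - (int)b[i + 2]);
        sum += (unsigned long)abs((int)a[i + 3] - (int)b[i + 3]);
    }
    for (; i < n; i++)            /* leftover samples */
        sum += (unsigned long)abs((int)a[i] - (int)b[i]);
    return sum;
}

/* Subsampled variant: compares only every step-th sample.  Cheaper,
 * but it can miss differences that fall between the sampled points. */
unsigned long cost_mae_sub(const unsigned char *a, const unsigned char *b,
                           size_t n, size_t step)
{
    unsigned long sum = 0;
    size_t i;

    assert(a != NULL && b != NULL && step > 0);
    for (i = 0; i < n; i += step)
        sum += (unsigned long)abs((int)a[i] - (int)b[i]);
    return sum;
}

/* Sum of squared differences (MSE without the division by n).
 * Squaring penalizes a few large errors more than many small ones. */
unsigned long cost_mse(const unsigned char *a, const unsigned char *b,
                       size_t n)
{
    unsigned long sum = 0;
    size_t i;

    assert(a != NULL && b != NULL);
    for (i = 0; i < n; i++) {
        int d = (int)a[i] - (int)b[i];
        sum += (unsigned long)(d * d);
    }
    return sum;
}
```

        A search method would call one of these for each candidate
    block and keep the candidate with the smallest cost.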


  e. Parameters

        Pitch:
                The pitch shift, a number between -12.0 and +12.0
            semi-tones.  (To specify a negative pitch, use the "--"
            option.)

        Frequency:
                The sample frequency in Hertz.  Some soundcards
            have special limits on the sample frequency.  For
            example, the GUS can only sample at 44100Hz, 22050Hz
            and 11025Hz.  The GUSMAX can sample at more
            frequencies.  Using a sample frequency not supported
            by the soundcard may cause problems.


  f. During Runtime

        (GUS users or GUSMAX users using ps16.exe)
            During runtime there is a changing hex number on the
        screen.  It indicates the buffer location when the program
        runs through an iteration.  If this number is near the
        buffer size, the CPU is nearly 100% utilized.  You can find
        out the buffer size by specifying the '-i' option.  After
        you leave the program, it shows the maximum CPU utilization
        as a percentage.

            You can press '+' or '-' to change the pitch interactively.
        Press the ESC key to quit.

        (GUSMAX users using psg.exe)
            You can use the mouse to click on the buttons.

4. FILES:

        psr.exe         : Executable of GUS version.
        ps16.exe        : Executable of GUSMAX version, textmode interface.
        psg.exe         : Executable of GUSMAX version, graphics interface.
                          (mouse needed).
        nctu.jpg        : Our school mark (rendered by me using povray).
        readme.txt      : (this file)
        hardware.txt    : Some hardware information for GUS users.
        gusrun4.c       : The GUS realtime driver used in this release.
                          (Lousily coded, I know.  Please email any
                          good ideas.)



5. SPECIAL NOTES:

        a. If you specify the -f (FULL-SEARCH) option, it will increase
           the computing time by 3-4 times.  I wouldn't suggest using
           full search at 44100Hz unless you have a Pentium 90/100.

        b. The program currently requires a 256k Ultrasound or an
           UltraMax.  Your PC should be a 486DX-33 at minimum.
           Upgrading your soundcard to 1MB of on-board memory is very
           good, but of no use to our program.  If you run the program
           on a 386, please specify a lower sample rate (like 22050Hz
           or 11025Hz).

        c. Sorry, users of a GUS with the 16-bit DB cannot use the
           GUSMAX version.  The 16-bit DB cannot support sampling and
           playback at the same time.

        d. Removing the memory manager may speed up the program a
           little.  If your CPU really can't keep up with the
           program, Intel is selling 486/Pentium chips cheaply these
           days.



6. TROUBLESHOOTING:

       Q0:  Why can't I use my 286?
       A0:  Well, we really hoped you could run the program on a 286.
            But there is one critical function (also the most
            time-consuming one) that we can only optimize with
            386-specific code.  Sorry guys, upgrade your machines!

       Q1:  I can't hear anything!
       A1:  First check whether you really have a signal on the
            LINEIN (or MICIN if you enabled it with the option).  If
            you do have a signal on the input and you are not using a
            GUSMAX, try swapping your right channel with the left
            channel.  Currently the GUS version only samples the left
            channel of the LINEIN.  If you still can't hear the sound,
            please see the next question.

       Q2:  I can only hear a very faint sound at the output.
       A2:  There was a misprint on the early GUS boxes: the LINEIN
            was confused with the MICIN.  The LINE-IN is the stereo
            jack near the ISA bus edge-connector, not the jack near
            the MIDI/Joystick connector.

       Q3:  I get distorted output.
       A3:  There are 2 possible reasons.  One is that you have
            specified an illegal sample rate for the soundcard.  For
            example, sampling at 8kHz on the GUS will produce strange
            results.  The other is that your input waveform may exceed
            the soundcard's limit and cause clipping.  Try using a
            lower volume.

       Q4:  I can hear the sound, but it's not pitch shifted!
       A4:  Have you run the program with a "pitch" parameter other
            than 0?  If you have, the problem may arise from a wrong
            connection of the input or output.  Check them again.

       Q5:  The output is very noisy!
       A5:  Are you using MICIN?  Or are you using AMPOUT rather than
            LINEOUT?  The MICIN and the AMPOUT are very noisy.
            Try changing your configuration to LINEIN and LINEOUT.

       Q5a: I'm using neither MICIN nor AMPOUT, but there are lots of pops!
       A5a: One reason may be that your music is not suitable for
            pitch-shifting.  Fast-beating sounds (such as piano,
            drums...) create lots of crackles because they appear and
            disappear too fast.  Voice is a better input source.
            (Strings are so simple that they might create crackles
            too.)

       Q6:  When the music is complex, it distorts a lot!
       A6:  Try using the -f option to select the FULL-SEARCH method.
            This may take 3-4 times the computing power, so you might
            need to reduce the sample rate at the same time.

       Q7:  But there is still some random noise!
       A7:  We can't help with that kind of noise; it is algorithm
            dependent.  We'll try to improve the GUS driver, but it
            seems there are some major problems we cannot solve.  We
            have included the driver source in the archive.  If you
            see anything that can be improved, please email us.

       Q8:  What is the fastest configuration?
       A8:  Try "-c2" or "-c1"; they are optimized versions of the
            default setting.

       Q9:  I can't specify a negative pitch.
       A9:  Use the "--" option before the negative pitch.

       Q10: I've got everything right, but it says "buffer overrun"
            and quits now and then.
       A10: There are 4 small output buffers, each about 4kByte
            (depending on the sample rate).  If your CPU cannot keep
            up with the incoming data and the output buffers run out
            before new data is generated, the program tells you the
            buffer is running out.  Either specify a lower sampling
            frequency or use a larger buffer (with the -b$$$$$
            option).

       Q11: I want to specify more than +-12 semi-tones!
       A11: That's not a limit of the algorithm, just a limit in
            our implementation.  Well, if you can give us a reason
            why more than +-12 semi-tones is useful, we'll change
            that limit.

       Q12: Why the long delay between input and output?
       A12: We are currently working on the problem.  We hope it can
            be less than 200ms.  This requires a total rewrite of
            the GUS driver or a change of the algorithm.

       Q13: Can I save the pitch-shifted sound?
       A13: No.  It is not very difficult to implement, but it would
            take too much CPU time at runtime.  Instead, we plan to
            include a separate tool in a future release that you can
            use to pitch-shift your own sample files afterwards.

       Q14: Hey, there are random spikes!  Your driver is broken!
       A14: Are you using a GUS?  Maybe you can upgrade to a GUSMAX.
            We've tried the so-called seamless playback mentioned
            in the GUS-SDK, but the timing drift problem of the GUS
            caused us VERY BIG problems.  Using the seamless playback
            and record method helps with the spikes, but after some
            time the record process loses sync with the playback
            process.  If you are interested in this problem, email us
            for more information or source code.  We will be very glad
            to discuss this problem with you.

       Q15: How did you program the GUSMAX?  I just crash all
            the time.
       A15: Well, the GUSMAX (or the codec) routines in the GUS-SDK are
            broken.  Try to work around the bugs.

       Q16: Hey, your program is lame!
       A16: Well, email me and let's discuss what to improve.  Good
            software grows with time.

       Q17: Hey, this is great!  I want it running on my soundcard!
       A17: Hmm.  How about upgrading to a GUS or GUSMAX?


7. ACKNOWLEDGEMENT:

        S.G Chen.
        G.J Lin.



8. APPENDIX

   APPENDIX A: Sample rate effects.

        sample rate | process unit | min-precache size | min-delay
        ------------+--------------+-------------------+-----------
        44100Hz     | 4kByte       | 8kByte            | 181ms
        22050Hz     | 2kByte       | 4kByte            | 181ms
        11025Hz     | 1kByte       | 2kByte            | 181ms
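
        The min-delay column appears to be the min-precache size
    divided by the sample rate.  The C sketch below reproduces the
    181ms figure under two assumptions that this file does not state:
    one byte per sample, and 1 kByte counted as 1000 bytes.  The
    function name is hypothetical.

```c
#include <assert.h>

/* Time in milliseconds represented by `bytes` of buffered audio.
 * bytes_per_sample = 1 (8-bit mono) is an assumption, not something
 * this README confirms. */
double buffer_delay_ms(unsigned long bytes, unsigned long sample_rate_hz,
                       unsigned bytes_per_sample)
{
    assert(sample_rate_hz > 0 && bytes_per_sample > 0);
    return 1000.0 * (double)bytes
           / ((double)sample_rate_hz * (double)bytes_per_sample);
}
```

        For example, buffer_delay_ms(8000, 44100, 1) gives about 181ms,
    and so do 4000 bytes at 22050Hz and 2000 bytes at 11025Hz, because
    the precache size shrinks in step with the sample rate.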



9. HISTORY:

====================
psr 1.0, 1994.Sep.12

  This is the very first release of our real-time pitch-shifting program.
This version is very solid, but its quality is lousy and it demands a
lot of computing power.  So we called it a test release.  Anyway, when
we announced it on netnews, we received fewer than 10 postings
asking for the software.  :(  That was very disappointing for us.

====================
psr 1.1, 1994.Sep.15

  Optimized inner loop.         (now can only run on 386 or above.)
  Revised GUS output driver.    (can you hear the difference?)
  One more search method.       (full search at 44100Hz may take a Pentium90)
  More interactive controls.

====================
psr 1.2, 1994.Sep.19

  Modified hardware.txt         (red-BNC is the left channel.)

====================
psr 1.3, 1994.Sep.21

  Thanks to everyone who has responded.  Your support helps a lot!
This time we have more modifications.

  Refined documents.
  Re-optimized inner loop.      (Only 2 cycles less. :( )
  Add -m option. (MICIN)        (Not sure if working).
  Add -i option.                (To see some internal parameters).
  Add -b option.                (To change the buffer size).
  Add -p option.                (To change precache buffer number).

  note: -b and -p options are for experienced users.

====================
psr 1.5,

  Lots of modifications.

