This directory contains the code for FinSeer.
- The datasets used in our paper are uploaded as FinSeer_data.
- Our retriever is uploaded as FinSeer.
- Our fine-tuned stock LLM is uploaded as StockLLM.
## Environment

```bash
# for baseline RAG models and retriever training
pip install InstructorEmbedding
pip install -U FlagEmbedding
pip install sentence-transformers==2.2.2
pip install protobuf==3.20.0
pip install yahoo-finance
python -m pip install -U angle-emb
pip install transformers==4.33.2  # UAE
```
## Train the retriever

### Step 1. Get LLM feedback scores (`src/2_train_retriever/get_llm_feedback_scores.py`)

- `dataset`: the dataset to score, one of `acl18`, `bigdata22`, `stock23`
- `target`: the file to save the results to, a JSON with LLM probability scores
```python
parser = argparse.ArgumentParser(description='test')
parser.add_argument('--dataset', default='acl18', type=str)
parser.add_argument('--target', default='acl18.scored.json', type=str)
args = parser.parse_args()

get_all_scores(llm='StockLLM')  # all LLaMA-family models are supported by this code
```
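The exact scoring logic lives in `get_llm_feedback_scores.py`. As a rough illustration of the idea only (not the repository's implementation), the sketch below treats the feedback score of a candidate as the log-probability StockLLM assigns to the ground-truth movement label when the candidate sequence is shown alongside the query; the model id, prompt wording, and label strings are assumptions.

```python
# Illustrative sketch only, NOT the repo's implementation.
# Assumption: the feedback score of a candidate is the log-probability the LLM
# assigns to the gold movement label ("rise"/"fall") given query + candidate.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "TheFinAI/StockLLM"  # hypothetical model id; any LLaMA-family model works
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
model.eval()

def feedback_score(query_text: str, candidate_text: str, gold_label: str) -> float:
    prompt = f"{candidate_text}\n{query_text}\nAnswer (rise or fall):"
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids          # [1, P]
    label_ids = tokenizer(" " + gold_label, add_special_tokens=False).input_ids
    input_ids = torch.cat([prompt_ids, torch.tensor([label_ids])], dim=1)  # [1, P+L]
    with torch.no_grad():
        logits = model(input_ids).logits                                   # [1, P+L, V]
    # Rows P-1 .. P+L-2 of the logits predict the L label tokens.
    log_probs = torch.log_softmax(logits[0, prompt_ids.shape[1] - 1 : -1], dim=-1)
    return sum(log_probs[i, tok].item() for i, tok in enumerate(label_ids))
```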
### Step 2. Select positive and negative candidates (`src/2_train_retriever/select_positive_and_combine.py`)

Before this step, you should already have generated `acl18.scored.json`, `bigdata22.scored.json`, and `stock23.scored.json`.
This step selects candidates for all three datasets and generates a combined `train.scored.json` file (a rough sketch of one possible selection scheme follows below).
Then, following the steps in this link, you can fine-tune your own FinSeer on the `train.scored.json` data.
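For reference, one simple selection scheme is to take the highest-scored candidate of each query as the positive and the lowest-scored ones as negatives. The sketch below assumes each entry of a `*.scored.json` file holds a query plus its candidates with the step-1 scores; the field names are assumptions, not the repo's actual schema.

```python
# Illustrative sketch only; field names ('query', 'candidates', 'text', 'score')
# are assumptions about the *.scored.json schema, not the repo's actual format.
import json

def select_and_combine(scored_files, out_path="train.scored.json", n_neg=15):
    train = []
    for path in scored_files:
        with open(path) as f:
            entries = json.load(f)
        for entry in entries:
            ranked = sorted(entry["candidates"], key=lambda c: c["score"], reverse=True)
            train.append({
                "query": entry["query"],
                "pos": [ranked[0]["text"]],                    # highest-scored candidate
                "neg": [c["text"] for c in ranked[-n_neg:]],   # lowest-scored candidates
            })
    with open(out_path, "w") as f:
        json.dump(train, f)

select_and_combine(["acl18.scored.json", "bigdata22.scored.json", "stock23.scored.json"])
```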
## Stock movement prediction

### Step 1. Get embeddings of queries and candidates (`src/3_stock_movement_prediction/get_embeddings.py`)

- `test_dataset`: the test dataset, e.g. `bigdata22`
- `embedding_model`: the retriever to embed with, one of `instructor`, `uae`, `bge`, `llm_embedder`, `e5`, `FinSeer`
- `q_or_c`: `query` or `candidate`; the embeddings of query sequences and candidate sequences are generated in separate runs
```python
parser = argparse.ArgumentParser(description='test')
parser.add_argument('--test_dataset', default='bigdata22', type=str)
parser.add_argument('--embedding_model', default='e5',
                    choices=['instructor', 'uae', 'bge', 'llm_embedder', 'e5', 'FinSeer'])
parser.add_argument('--q_or_c', default='candidate')
args = parser.parse_args()
```
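For the `e5` baseline, for instance, this step boils down to something like the sketch below; the model id, the `query:`/`passage:` prefixes, and the serialised sequence texts are assumptions for illustration, not the repo's exact inputs.

```python
# Minimal sketch for the 'e5' baseline; the model id and the serialised sequence
# texts are assumptions, not the repo's exact inputs.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("intfloat/e5-base-v2")

# Queries and candidates are embedded in separate runs (--q_or_c query / candidate);
# E5-style models expect "query: " / "passage: " prefixes.
query_texts = ["query: AAPL daily price movements over the last 5 trading days: ..."]
candidate_texts = ["passage: MSFT adjusted close prices over the same window: ..."]

query_emb = model.encode(query_texts, normalize_embeddings=True)
candidate_emb = model.encode(candidate_texts, normalize_embeddings=True)
```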
### Step 2. Calculate the similarity between the query and the qualified candidates (and keep the top-5 related candidates)
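With the normalised embeddings from step 1, this reduces to a dot-product search. A minimal sketch (the random vectors merely stand in for the step-1 outputs):

```python
# Cosine similarity between a query embedding and all qualified candidate
# embeddings, keeping the top-5.
import numpy as np

def top_k_candidates(query_emb, candidate_embs, k=5):
    # With L2-normalised embeddings, the dot product equals cosine similarity.
    sims = candidate_embs @ query_emb
    top_idx = np.argsort(-sims)[:k]
    return top_idx, sims[top_idx]

# Random embeddings stand in for the outputs of step 1.
rng = np.random.default_rng(0)
q = rng.normal(size=768); q /= np.linalg.norm(q)
c = rng.normal(size=(100, 768)); c /= np.linalg.norm(c, axis=1, keepdims=True)
idx, scores = top_k_candidates(q, c)
```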
### Step 3. Predict stock movement with or without retrieval

This is what our three files do (a conceptual sketch follows the list):
- `1_no_retrieval.py`
- `2_random_retrieval.py`
- `3_similarity_retrieval.py`
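Conceptually, the scripts differ only in what (if anything) is placed in the prompt before the query: nothing, randomly sampled candidates, or the top-5 most similar candidates from step 2. The prompt wording below is an assumption, not the repo's template.

```python
# Illustrative only: with retrieval, the retrieved candidate sequences are
# prepended to the query before asking StockLLM for the movement; without
# retrieval, the prompt contains the query alone. Wording is an assumption.
def build_prompt(query_text, retrieved_texts=None):
    context = ""
    if retrieved_texts:
        context = "Reference sequences:\n" + "\n".join(retrieved_texts) + "\n\n"
    return (context
            + "Predict the next movement (rise or fall) for the following stock:\n"
            + query_text)

print(build_prompt("AAPL daily price movements over the last 5 trading days: ..."))
```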