How to make agents understand the environment descriptions and act accordingly? [Spoiler Alert] Language-conditioned world model improves policy generalization by reading environmental descriptions